The tests weren't exhaustive, just simple comparison tests:
(time (repeat 1000 (map1 [* _ 3] '(1 2 3 4 5))))
Comparing the different implementations of map1, the one using the definition of fold (foldr) at the top of the page was the fastest, followed by Arc's version, followed by Anarki's version (using the definitions of flip, foldl and foldr in rntz's comment).
These informal tests were repeated on best and rev with the same results. I honestly don't know why it's faster.
No matter the number of tests, though, the difference is only a millisecond or so, which is negligible as the size of the input increases or if the mapped function does nontrivial work.
However, I've also tried the non-tail-recursive fold on very large lists, and it doesn't appear to blow the stack. mzscheme probably does something to prevent this from happening, like allocating "stack" frames on the heap (a standard technique for implementing call/cc without stack-copying, and if your GC is good it's not much of a penalty). Given this, it seems more defensible that arc.arc's 'map1 et al are non-tail-recursive. I think I may change lib/util's foldr to use the OP's implementation.