Php – Elegant way to search for UTF-8 files with BOM


For debugging purposes, I need to recursively search a directory for all files which start with a UTF-8 byte order mark (BOM). My current solution is a simple shell script:

find -type f |
while read file
do
    if [ "`head -c 3 -- "$file"`" == $'\xef\xbb\xbf' ]
    then
        echo "found BOM in: $file"
    fi
done

Or, if you prefer short, unreadable one-liners:

find -type f|while read file;do [ "`head -c3 -- "$file"`" == $'\xef\xbb\xbf' ] && echo "found BOM in: $file";done

It doesn't work with filenames that contain a line break,
but such files are not to be expected anyway.

Is there any shorter or more elegant solution?

Are there any interesting text editors or macros for text editors?

Best Solution

What about this simple one-line command, which not only finds but also removes the nasty BOM? :)

find . -type f -exec sed '1s/^\xEF\xBB\xBF//' -i {} \;

I love "find" :)

Warning: The above will also modify binary files that start with those three bytes.
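
One way to lower that risk might be to touch only files that match a name pattern and really begin with EF BB BF. The following is just a sketch under those assumptions: strip_bom_in is a made-up helper name, and the '*.php' default pattern is a guess based on the question's topic, not something the original command does. It uses head, od and tail instead of sed, so files without a BOM are never rewritten.

```shell
# strip_bom_in DIR [PATTERN]: hypothetical helper, not part of the original answer.
# Removes a leading UTF-8 BOM from files under DIR whose name matches PATTERN.
# The '*.php' default is an assumption; pass another pattern as needed.
strip_bom_in() {
    find "${1:-.}" -type f -name "${2:-*.php}" -exec sh -c '
        for file do
            # Dump the first three bytes as hex and compare with the BOM.
            bytes=$(head -c 3 -- "$file" | od -An -tx1 | tr -d " \n")
            if [ "$bytes" = "efbbbf" ]; then
                tmp=$(mktemp) || exit 1
                # tail -c +4 copies everything from the fourth byte onward;
                # writing back with cat keeps the original permissions.
                tail -c +4 -- "$file" > "$tmp" && cat -- "$tmp" > "$file"
                rm -f -- "$tmp"
            fi
        done
    ' sh {} +
}
```

Usage would look like `strip_bom_in . '*.php'`. Files with other extensions are skipped even if they start with the same three bytes.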

If you want just to show BOM files, use this one:

grep -rl $'\xEF\xBB\xBF' .
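
Note that the grep above reports every file that contains the byte sequence anywhere, not only at the very start. If only a leading BOM matters, something along these lines could work. It is a rough sketch, and find_bom_files is a hypothetical helper name, not part of the original answer. Passing the names through `-exec ... {} +` also sidesteps the line-break limitation mentioned in the question.

```shell
# find_bom_files [DIR]: hypothetical helper, not part of the original answer.
# Prints every regular file under DIR whose first three bytes are EF BB BF.
find_bom_files() {
    # -exec ... {} + hands names over as arguments, so spaces and
    # line breaks in filenames are passed through intact.
    find "${1:-.}" -type f -exec sh -c '
        for file do
            # Dump the first three bytes as hex and compare with the BOM.
            bytes=$(head -c 3 -- "$file" | od -An -tx1 | tr -d " \n")
            if [ "$bytes" = "efbbbf" ]; then
                printf "%s\n" "$file"
            fi
        done
    ' sh {} +
}
```

Call it as `find_bom_files .` or with any other starting directory.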