bom

strip UTF-8 byte order marks

The bom package provides a convenient way to strip UTF-8 byte order marks (BOM) from the beginning of a byte slice or an io.Reader.

The Unicode Standard defines the UTF-8 byte order mark as the byte sequence 0xEF,0xBB,0xBF, but neither requires nor recommends its use. The Go standard library provides no support for UTF-8 byte order marks, and it looks like it never will. To quote Andy Balholm in the discussion of this issue at https://groups.google.com/forum/#!topic/golang-nuts/OToNIPdfkks:

> The Go team includes the original designers of UTF-8, and they consider BOMs an aBOMination. They are reluctant to do anything to make life easier for people who use BOMs. :-)
>
> (Although they did make the compiler accept source files with BOMs, if I remember right.)

In the same discussion thread, another participant comments that it should not be difficult to write an io.Reader that eats the BOM.

It isn't difficult, and here is one simple implementation.

