I've read the documentation and can't find any examples.
http://golang.org/pkg/unicode/#IsPunct
Is there a place in the documentation that explicitly lists all characters in these categories? I'd like to see what characters are contained in category P or category M.
It's not in the documentation, but you can still read the source code. The categories you're talking about are defined in this file: http://golang.org/src/pkg/unicode/tables.go
For example, the P category is defined this way:
    var _P = &RangeTable{
        R16: []Range16{
            {0x0021, 0x0023, 1},
            {0x0025, 0x002a, 1},
            {0x002c, 0x002f, 1},
            {0x003a, 0x003b, 1},
            {0x003f, 0x0040, 1},
            {0x005b, 0x005d, 1},
            {0x005f, 0x007b, 28},
            ...
            {0xff5d, 0xff5f, 2},
            {0xff60, 0xff65, 1},
        },
        R32: []Range32{
            {0x10100, 0x10102, 1},
            {0x1039f, 0x103d0, 49},
            {0x10857, 0x1091f, 200},
            ...
            {0x12470, 0x12473, 1},
        },
        LatinOffset: 11,
    }
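And here is a simple way to print all of the ones in the R16 table, i.e. the code points below 0x10000 (the R32 table can be walked the same way). It needs the fmt and unicode imports:
    for _, r := range unicode.Punct.R16 {
        // Convert to rune so the counter cannot wrap around at 0xffff,
        // and so string(c) is a plain rune-to-string conversion.
        for c := rune(r.Lo); c <= rune(r.Hi); c += rune(r.Stride) {
            fmt.Print(string(c))
        }
    }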