Unicode Category Classes

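Unicode assigns every character to a "category". A category can be referenced with "\p{Code}", which matches any character in that category, or with "\P{Code}", which matches any character not in it. A one-letter code selects a whole group of categories (for example, L for all letters), and a two-letter code selects a single category within that group (for example, Lu for uppercase letters).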

For details of which characters each category contains, consult the documentation for java.lang.Character or the Unicode General Category definitions.
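
As a rough illustration of the difference between "\p{Code}" and "\P{Code}", the following sketch assumes the patterns are run through java.util.regex. The class and method names (UnicodeCategoryDemo, keepUppercase, dropUppercase, isAllLetters) are made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class UnicodeCategoryDemo {
    // \p{Lu} matches one character in the "uppercase letter" category.
    static final Pattern UPPER = Pattern.compile("\\p{Lu}");
    // \P{Lu} is the negation: one character NOT in that category.
    static final Pattern NOT_UPPER = Pattern.compile("\\P{Lu}");

    // Remove everything that is not an uppercase letter.
    static String keepUppercase(String s) {
        return NOT_UPPER.matcher(s).replaceAll("");
    }

    // Remove every uppercase letter.
    static String dropUppercase(String s) {
        return UPPER.matcher(s).replaceAll("");
    }

    // The one-letter code L covers Ll, Lm, Lo, Lt and Lu together.
    static boolean isAllLetters(String s) {
        return Pattern.matches("\\p{L}+", s);
    }

    public static void main(String[] args) {
        System.out.println(keepUppercase("Hello World"));        // HW
        System.out.println(dropUppercase("Hello World"));        // ello orld
        System.out.println(isAllLetters("Gr\u00F6\u00DFe"));     // true
        System.out.println(isAllLetters("abc1"));                // false
    }
}
```

Note that the backslash is doubled inside a Java string literal, so the regex "\p{Lu}" is written as "\\p{Lu}" in source code.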

Code	Description
C	all "other" characters (control, format, unassigned, private use, surrogate)
Cc	control
Cf	format
Cn	unassigned
Co	private use
Cs	surrogate
L	all letters
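L1	Latin-1, U+0000 to U+00FF (a Java extension, not a Unicode category)
LD	letter or digit (a Java extension, not a Unicode category)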
Ll	lowercase letter
Lm	modifier letter
Lo	other letter
Lt	titlecase letter
Lu	uppercase letter
M	all marks
Mc	combining spacing mark
Me	enclosing mark
Mn	non-spacing mark
N	all numbers
Nd	decimal digit number
Nl	letter number
No	other number
P	all punctuation
Pc	connector punctuation
Pd	dash punctuation
Pe	end punctuation
Pf	final quote punctuation
Pi	initial quote punctuation
Po	other punctuation
Ps	start punctuation
S	all symbols
Sc	currency symbol
Sk	modifier symbol
Sm	math symbol
So	other symbol
Z	all separators
Zl	line separator
Zp	paragraph separator
Zs	space separator
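
One common use of a two-letter code from the table above is stripping accents: decompose the text so that each accent becomes a separate non-spacing mark, then delete everything in category Mn. The sketch below assumes java.util.regex and java.text.Normalizer; the class and method names (StripMarksDemo, stripMarks, isNonSpacingMark) are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical demo class; the names are illustrative only.
class StripMarksDemo {
    // Decompose to NFD so that, for example, U+00E9 becomes 'e' followed by
    // U+0301, then remove every character in category Mn (non-spacing mark).
    static String stripMarks(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{Mn}+", "");
    }

    // java.lang.Character reports the same category through getType().
    static boolean isNonSpacingMark(int codePoint) {
        return Character.getType(codePoint) == Character.NON_SPACING_MARK;
    }

    public static void main(String[] args) {
        System.out.println(stripMarks("Cr\u00E8me Br\u00FBl\u00E9e")); // Creme Brulee
        System.out.println(isNonSpacingMark(0x0301));                  // true
        System.out.println(isNonSpacingMark('e'));                     // false
    }
}
```

This only removes marks that decompose cleanly under NFD. Letters such as U+00F8 (o with stroke) have no decomposition and pass through unchanged.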