Hello. This is a guide for writing patterns for l7-filter.
*** File Format ***
A pattern file consists of the name of the protocol on one line followed
by a regular expression defining that protocol on one line. Lines
starting with # and blank lines are ignored. At the moment, the name of the
file must match the protocol defined in it. For example, if the protocol is
"ftp", the file must be called "ftp.pat".
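As an illustration of the format described above, here is a minimal Python
sketch of a reader for the text of a pattern file. The function names
parse_pattern_text and expected_filename are hypothetical and are not part
of l7-filter; the real parser may treat whitespace and errors differently.

```python
def parse_pattern_text(text):
    """Return (protocol_name, regex) from the text of a pattern file.

    Sketch only: this follows the format described in this guide and is
    not l7-filter's own parser.
    """
    # Ignore blank lines and lines starting with '#'.
    lines = [
        line
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    if len(lines) != 2:
        raise ValueError("expected one protocol name line and one regex line")
    name, regex = lines
    return name, regex


def expected_filename(name):
    # The file name must match the protocol name, e.g. "ftp" -> "ftp.pat".
    return name + ".pat"


sample = "# File Transfer Protocol\n\nftp\n220.*ftp|331.*password\n"
name, regex = parse_pattern_text(sample)
print(name, regex, expected_filename(name))
```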
*** Regular Expressions ***
At the moment, l7-filter uses V8 regular expressions (see
http://www.hmug.org/man/3/regsub.html). In addition, we have added the
capability to do Perl-style hex matches using the \xHH notation (e.g. to
match a tab, use \x09). Note that our regexps are more limited than the
ones you may be used to (for instance, grep's). Notably, you CANNOT use
bounds ("foo{3}" to match foofoofoo), character classes ("[[:punct:]]"
to match any punctuation) or backreferences.
If you want to check for printable characters, including whitespace,
[\x09-\x0d -~] is a reasonable way of doing it.
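To try out the \xHH notation and the printable-character class, Python's re
module can serve as a rough stand-in, since it accepts the same \xHH escapes.
It is not the engine l7-filter uses, and the helper name is_printable is made
up for this sketch.

```python
import re

# \x09 is a tab; the class covers tab through carriage return
# (\x09-\x0d) plus space through tilde.
TAB = re.compile(rb"\x09")
PRINTABLE = re.compile(rb"[\x09-\x0d -~]*")


def is_printable(data):
    # True when every byte of data falls inside the class above.
    return PRINTABLE.fullmatch(data) is not None


print(TAB.search(b"a\tb") is not None)
print(is_printable(b"USER bob\r\n"))
print(is_printable(b"\x01\x02binary"))
```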
l7-filter is case insensitive. Using upper case in your patterns is
identical to using lower case. This is largely because our regexp
implementation doesn't have a case-insensitivity flag.
l7-filter strips out nulls (0x00 bytes) in the packets it reads so that
it can treat them as normal C strings. Therefore, you can't match on
nulls. Also, fields may appear shorter than expected. For example, if
a protocol has a 4 byte field and any of those bytes can be null, the
field can appear to be any length from 0 to 4.
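The effect of null stripping on a fixed-width field can be sketched in
Python as follows. The helper as_classifier_sees is hypothetical, and
lowercasing is only one way to model case insensitivity; how l7-filter
implements it internally is not stated here.

```python
def as_classifier_sees(payload):
    """Approximate the preprocessing described above (illustrative only)."""
    # Nulls are removed so the data can be treated as a C string.
    without_nulls = payload.replace(b"\x00", b"")
    # Assumption: lowercasing stands in for case-insensitive matching.
    return without_nulls.lower()


# A made-up packet: a 4-byte length field holding the value 5, then text.
packet = b"\x00\x00\x00\x05HELLO"
seen = as_classifier_sees(packet)
print(seen)
print(len(packet) - len(seen))
```

Here the 4-byte field shrinks to a single byte, so a pattern that expects
four bytes before the text would never match.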
*** What The Classifier Sees ***
If you have set up your computer as recommended, the data to be matched
is that of both the client and the server, in the order that it passes
through the computer. For instance, in FTP, the first thing the filter
sees is "220 server ready", then "USER bob", then "331 send password",
then "PASS frogbeard", and so on.
At the moment, only the QoS version of l7-filter can match across
packets. Therefore, if you are using the QoS version, both
"220.*ftp|331.*password" and "220.*USER.*331" will match. If you are
using the Netfilter version, only the first will match. Hopefully,
cross-packet matching will be added to the Netfilter version soon.
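The difference between per-packet and cross-packet matching can be sketched
in Python on an FTP-like exchange. Python's re module is only a stand-in for
l7-filter's regexp engine; the flags re.DOTALL and re.IGNORECASE and the two
helper functions are assumptions made for this illustration.

```python
import re

# Packets in the order they pass through the machine.
packets = [
    b"220 server ready\r\n",   # server
    b"USER bob\r\n",           # client
    b"331 send password\r\n",  # server
    b"PASS frogbeard\r\n",     # client
]

# Assumptions: "." may match line endings, and matching ignores case.
FLAGS = re.DOTALL | re.IGNORECASE
single = re.compile(rb"220.*ftp|331.*password", FLAGS)
spanning = re.compile(rb"220.*USER.*331", FLAGS)


def matches_per_packet(pattern, pkts):
    # Each packet is examined on its own.
    return any(pattern.search(p) for p in pkts)


def matches_stream(pattern, pkts):
    # Packets are joined, so a match may span several of them.
    return pattern.search(b"".join(pkts)) is not None


print(matches_per_packet(single, packets), matches_stream(single, packets))
print(matches_per_packet(spanning, packets), matches_stream(spanning, packets))
```

The second pattern needs text from three different packets, so it can only
match when the packets are examined together.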
*** What Makes A Good Pattern ***
There are two general guidelines:
1) A pattern must be neither too specific nor too general.
Example 1: The pattern "bear" for Bearshare is not specific enough.
This pattern could match a wide variety of non-Bearshare connections.
For instance, an HTTP request for http://bear.com would be matched.
Example 2: "220 .*ftp.*(\[.*\]|\(.*\))" for FTP is too specific.
Not all servers send ()s or []s after their 220. In fact, servers
are not even required to send the string "ftp" at any time, but the
vast majority do. Good judgement and testing are necessary for
instances such as this.
2) It should use a minimum of processing power. Thus, if it is possible
to reduce the number of *'s, +'s and |'s in your pattern, you should do
so. In particular, ".*" really slows things down. For instance, this
pattern for gopher is bad. Its use causes very noticeable slowdowns on
my machine (i.e. everything locks up for 1 second or more).
[1-9,\+TgI].*\x09.*\x09.*[a-z].*\..*[a-z][a-z]\x09[1-9]
In fact, with versions 0.3.0 and later of the kernel patch (which look
at multiple packets at once), all those .*'s can be slow enough to force
you to reboot your machine. So be careful! This pattern is better: it
makes some reasonable assumptions about what's likely to be in those
freeform fields rather than being totally permissive:
[1-9,\+TgI][\x09-\x0d -~]*\x09[\x09-\x0d -~]*\x09[a-z0-9\.]*\.[a-z][a-z].?.?\x09[1-9]
Remember, if you're only expecting printable characters, don't use ".*",
use "[\x09-\x0d -~]*"!
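The two gopher patterns above can be compared on sample data with a short
Python sketch. Python's re module stands in for l7-filter's engine, the
flags are assumptions, and both sample lines (including the host
gopher.example.org) are made up. The sketch checks only what each pattern
accepts; it says nothing about speed inside the kernel.

```python
import re

# Assumptions: "." may match any byte, and matching ignores case.
FLAGS = re.DOTALL | re.IGNORECASE

loose = re.compile(
    rb"[1-9,\+TgI].*\x09.*\x09.*[a-z].*\..*[a-z][a-z]\x09[1-9]", FLAGS
)
tight = re.compile(
    rb"[1-9,\+TgI][\x09-\x0d -~]*\x09[\x09-\x0d -~]*\x09"
    rb"[a-z0-9\.]*\.[a-z][a-z].?.?\x09[1-9]",
    FLAGS,
)

# A made-up gopher menu line: type, name, selector, host, port.
menu_line = b"1Some directory\t/path\tgopher.example.org\t70\r\n"
# The same line with a non-printable byte in the name field.
binary_line = b"1\x01\t/path\tgopher.example.org\t70\r\n"

for name, pattern in (("loose", loose), ("tight", tight)):
    print(
        name,
        pattern.search(menu_line) is not None,
        pattern.search(binary_line) is not None,
    )
```

Both patterns accept the ordinary menu line, but only the permissive one
accepts the line containing a non-printable byte.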
The recommended procedure for writing patterns is this:
1) Find and read the spec for the protocol you wish to match. If it's
an Internet standard, http://rfc-editor.org is a good place to start,
although not all standards are RFCs. If it is a proprietary protocol,
it is likely that someone has written a reverse-engineered spec for it
anyway. Do a general web search to find it. Skipping this step is a
good way to write patterns that are overly specific!
2) Use Ethereal (http://www.ethereal.com/) to watch packets of this
protocol go by in a typical session of its use. (If you failed to find
a spec for your protocol, but Ethereal can parse it, reading the
Ethereal source code may also be worth your time.)
3) Write a pattern that will reliably match one of the first few packets
that are sent in your protocol. Test it.
4) Send your pattern to l7-filter-developers@lists.sf.net for it to be
incorporated into the official pattern definitions.
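For the testing in step 3, one possible approach is to run a candidate
pattern against captured payloads that should match and payloads that
should not. The sketch below uses Python's re module as a stand-in for
l7-filter's engine; the function check_pattern and all sample payloads are
hypothetical, and the two patterns are the examples discussed earlier in
this guide.

```python
import re


def check_pattern(pattern, should_match, should_not_match):
    """Return (missed, false_positives) for a candidate pattern."""
    # Assumptions: "." may match any byte, and matching ignores case.
    compiled = re.compile(pattern, re.DOTALL | re.IGNORECASE)
    missed = [s for s in should_match if not compiled.search(s)]
    false_positives = [s for s in should_not_match if compiled.search(s)]
    return missed, false_positives


# Made-up FTP greetings, with and without parentheses after the 220.
ftp_greetings = [
    b"220 example FTP server (version 1) ready\r\n",
    b"220 example FTP server ready\r\n",
]
# A made-up HTTP request that has nothing to do with Bearshare.
http_requests = [b"GET / HTTP/1.1\r\nHost: bear.com\r\n\r\n"]

# Too specific: misses the greeting without ()s or []s.
missed, _ = check_pattern(rb"220 .*ftp.*(\[.*\]|\(.*\))", ftp_greetings, [])
# Not specific enough: matches an unrelated HTTP request.
_, false_pos = check_pattern(rb"bear", [], http_requests)
print(len(missed), len(false_pos))
```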