Regular expression syntax demystified
Support for multiple platforms
Regular expressions were first proposed by the mathematician Stephen Kleene in 1956, building on the results of his research on natural language. As a complete syntax for matching patterns of characters, they were soon applied to the field of information technology. Since then, regular expressions have gone through several periods of development, and the current standard has been approved by ISO (the International Organization for Standardization) and defined by The Open Group.
A regular expression (RE) is not a dedicated language, but it can be used to find and replace characters in a file or in standard text. There are two standards: basic regular expressions (BRE) and extended regular expressions (ERE). ERE includes the functionality of BRE plus some additional concepts.
Many programs use regular expressions, including xsh, egrep, sed, vi, and other programs on the Unix platform. They have also been adopted by many languages, such as HTML and XML, which usually adopt only a subset of the standard.
More common than you might imagine
As regular expressions have been ported to cross-platform programming languages, their functionality has become more complete and their use has gradually widened. Search engines on the web use them, e-mail programs use them, and even if you are not a UNIX programmer, you can use regular expressions to simplify your programs and shorten your development time.
Regular expressions 101
Much regular expression syntax looks alike, but only because you have not studied it before. Wildcards are one type of RE construct, namely the repetition operators. Let us first look at the most common basic syntax types of the ERE standard. To provide concrete examples, I will use several different programs.
The key to regular expressions is determining what you want to search for; without that concept, REs are useless.
Each expression contains an instruction describing what to find, as shown in Table A.
Table A: Character-matching regular expressions

| Operator | Description | Example | Result |
|---|---|---|---|
| . | Match any one character | grep ".ord" sample.txt | Will match "ford", "lord", "2ord", etc. in the file sample.txt |
| [ ] | Match any one character listed between the brackets | grep "[cng]ord" sample.txt | Will match only "cord", "nord", and "gord" |
| [^ ] | Match any one character not listed between the brackets | grep "[^cn]ord" sample.txt | Will match "lord", "2ord", etc., but not "cord" or "nord" |
| [ ] | As above, using ranges | grep "[a-zA-Z]ord" sample.txt | Will match "aord", "bord", "Aord", "Bord", etc. |
| [^ ] | As above, using a range | grep "[^0-9]ord" sample.txt | Will match "Aord", "aord", etc., but not "2ord", etc. |
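The character-matching operators can also be tried outside grep. The following is a minimal sketch in Python, chosen here only for illustration; the word list stands in for the contents of sample.txt, and the helper name matching is hypothetical. Python's re module is not a POSIX ERE implementation, but these particular operators behave the same way in it.

```python
import re

# Hypothetical stand-in for the lines of sample.txt.
words = ["ford", "lord", "2ord", "cord", "nord", "gord", "Aord"]


def matching(pattern, lines):
    """Return the lines that contain a match anywhere, roughly like grep."""
    return [line for line in lines if re.search(pattern, line)]


any_char = matching(r".ord", words)          # any one character before "ord"
listed = matching(r"[cng]ord", words)        # only c, n, or g before "ord"
not_listed = matching(r"[^cn]ord", words)    # anything except c or n
letters = matching(r"[a-zA-Z]ord", words)    # any ASCII letter
non_digits = matching(r"[^0-9]ord", words)   # anything except a digit

print(listed)      # ['cord', 'nord', 'gord']
print(not_listed)  # ['ford', 'lord', '2ord', 'gord', 'Aord']
```

Note that a negated set such as [^cn] still requires a character to be present; it does not match an empty position.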
Repetition operators, or quantifiers, describe how many times a particular element should be found. They are often combined with the character-matching syntax to find strings of several characters, as shown in Table B.
Table B: Regular expression repetition operators

| Operator | Description | Example | Result |
|---|---|---|---|
| ? | Match the preceding element one time, if it exists | egrep ".?erd" sample.txt | Will match "berd", "herd", etc., and "erd" |
| * | Match the preceding element multiple times, if it exists | egrep "n.*rd" sample.txt | Will match "nerd", "nrd", "neard", etc. |
| + | Match the preceding element one or more times | egrep "[n]+erd" sample.txt | Will match "nerd", "nnerd", etc., but not "erd" |
| {n} | Match the preceding element exactly n times | egrep "[a-z]{2}erd" sample.txt | Will match "cherd", "blerd", etc., but not "nerd" or "erd"; because the pattern is not anchored, grep also reports "buzzerd", which contains "zzerd" |
| {n,} | Match the preceding element at least n times | egrep ".{2,}erd" sample.txt | Will match "cherd" and "buzzerd", but not "nerd" |
| {n,N} | Match the preceding element at least n times, but not more than N times | egrep "n[e]{1,2}rd" sample.txt | Will match "nerd" and "neerd" |
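The repetition operators can be checked with a short script. This is a sketch in Python rather than egrep, and it makes one simplifying assumption: it uses re.fullmatch, so each word must match from start to end, whereas grep reports a line if the pattern matches anywhere in it. The word list and the helper name whole_matches are hypothetical.

```python
import re

# Hypothetical test words; each one is treated as a whole line.
words = ["erd", "berd", "herd", "nerd", "nnerd", "nrd",
         "neard", "neerd", "cherd", "blerd", "buzzerd"]


def whole_matches(pattern, items):
    """Return the items that the pattern matches from start to end."""
    return [item for item in items if re.fullmatch(pattern, item)]


zero_or_one = whole_matches(r".?erd", words)       # optional single character
zero_or_more = whole_matches(r"n.*rd", words)      # anything between n and rd
one_or_more = whole_matches(r"[n]+erd", words)     # at least one n
exactly_two = whole_matches(r"[a-z]{2}erd", words) # exactly two letters
at_least_two = whole_matches(r".{2,}erd", words)   # two or more characters
one_to_two = whole_matches(r"n[e]{1,2}rd", words)  # one or two e's

print(exactly_two)   # ['nnerd', 'neerd', 'cherd', 'blerd']
print(at_least_two)  # ['nnerd', 'neerd', 'cherd', 'blerd', 'buzzerd']
```

With re.search in place of re.fullmatch, the {2} pattern would also find a match inside "buzzerd", which is how an unanchored grep pattern behaves.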
Anchors describe where in the text a pattern must match, as shown in Table C. They are convenient to use in combination with the other operators. In the examples, I use the vi line-editor command :s, which stands for substitute. The basic syntax of this command is:
s/pattern_to_match/pattern_to_substitute/
Table C: Regular expression anchors

| Operator | Description | Example | Result |
|---|---|---|---|
| ^ | Match at the beginning of a line | s/^/blah/ | Inserts "blah" at the beginning of the line |
| $ | Match at the end of a line | s/$/blah/ | Inserts "blah" at the end of the line |
| \< | Match at the beginning of a word | s/\</blah/ | Inserts "blah" at the beginning of the word |
| \< | Match at the beginning of a word | egrep "\<blah" sample.txt | Matches "blahfield", etc. |
| \> | Match at the end of a word | s/\>/blah/ | Inserts "blah" at the end of the word |
| \> | Match at the end of a word | egrep "blah\>" sample.txt | Matches "soupblah", etc. |
| \b | Match at the beginning or end of a word | egrep "\bblah" sample.txt; egrep "blah\b" sample.txt | The first matches "blahcake"; the second matches "countblah" |
| \B | Match in the middle of a word | egrep "\Bblah" sample.txt | Matches "sublahper", etc. |
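The anchors can be tried in the same way. The sketch below uses Python's re.sub in place of the vi substitute command, which is an assumption made for illustration; the sample line and word list are hypothetical. Python's re module supports ^, $, \b and \B, but not the \< and \> word anchors, so \b is used for both word positions here.

```python
import re

line = "soup of the day"

# Zero-width anchors can be "replaced", which inserts text at that position.
at_start = re.sub(r"^", "blah ", line)  # insert at the beginning of the line
at_end = re.sub(r"$", " blah", line)    # insert at the end of the line

words = ["blahfield", "soupblah", "sublahper", "blah"]

# \b matches at a word boundary; \B matches anywhere that is not a boundary.
word_start = [w for w in words if re.search(r"\bblah", w)]
word_end = [w for w in words if re.search(r"blah\b", w)]
not_at_start = [w for w in words if re.search(r"\Bblah", w)]

print(at_start)      # blah soup of the day
print(at_end)        # soup of the day blah
print(word_start)    # ['blahfield', 'blah']
print(word_end)      # ['soupblah', 'blah']
print(not_at_start)  # ['soupblah', 'sublahper']
```

The position of the anchor relative to the literal text matters: \bblah requires the boundary before "blah", while blah\b requires it after.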
Another feature of REs is the alternation (or interval) operator. This operator, represented by the | symbol, is in effect an OR statement. The following statement returns the occurrences of "nerd" and "merd" in the file sample.txt:
egrep "(n|m)erd" sample.txt
Alternation is very powerful, especially when you are searching a file for different spellings of a word, but in this case you can get the same result with the following example:
egrep "[nm]erd" sample.txt
Alternation really shows its usefulness when you combine it with the more advanced features of REs.
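To illustrate, here is a small Python sketch (the word lists are hypothetical) showing that the two patterns above select the same words, and that alternation, unlike a bracket expression, can also choose between alternatives of different lengths, such as two spellings of one word.

```python
import re

words = ["nerd", "merd", "herd"]

# For single characters, alternation and a bracket expression are equivalent.
with_alternation = [w for w in words if re.search(r"(n|m)erd", w)]
with_brackets = [w for w in words if re.search(r"[nm]erd", w)]

# Alternation also handles multi-character alternatives, which brackets cannot.
spellings = ["color", "colour", "colr"]
either_spelling = [s for s in spellings if re.fullmatch(r"col(o|ou)r", s)]

print(with_alternation)  # ['nerd', 'merd']
print(either_spelling)   # ['color', 'colour']
```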
A few reserved characters
The last important feature of REs is reserved characters (also called special characters). For example, suppose you want to find the literal strings "ne*rd" and "ni*rd". The pattern "n[ei]*rd" matches "neeeeerd" and "nieieierd", but not the strings you are looking for. Because '*' (asterisk) is a reserved character, you must escape it with a backslash, that is: "n[ei]\*rd". Other reserved characters include:
$ (Dollar sign)
) (Right parenthesis)
+ (Plus symbol)
? (Question mark)
{ (Left curly bracket, or left brace)
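The effect of escaping a reserved character can be seen in a short Python sketch. The helper name whole_matches and the test strings are hypothetical; re.escape is a standard-library function that adds the backslashes for you when you want to match a string literally.

```python
import re

candidates = ["ne*rd", "ni*rd", "neeeeerd", "nieieierd"]


def whole_matches(pattern, items):
    """Return the items that the pattern matches from start to end."""
    return [item for item in items if re.fullmatch(pattern, item)]


# Unescaped, * is a repetition operator applied to [ei].
as_operator = whole_matches(r"n[ei]*rd", candidates)

# Escaped, \* matches a literal asterisk.
as_literal = whole_matches(r"n[ei]\*rd", candidates)

# re.escape builds the escaped form of a literal string.
escaped = re.escape("ne*rd")

print(as_operator)  # ['neeeeerd', 'nieieierd']
print(as_literal)   # ['ne*rd', 'ni*rd']
print(escaped)      # ne\*rd
```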
Once you include these characters in your searches, REs undoubtedly become very hard to read. For example, the eregi search pattern in the following PHP code is hard to read.
eregi("^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$", $sendto)
As you can see, the intent of the code is very difficult to grasp. But if you ignore the reserved characters, you will often misunderstand what the code means.
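One way to make such a pattern easier to follow is to split it into commented pieces. The sketch below does this in Python rather than PHP, which is an assumption made for illustration; the name EMAIL and the sample addresses are hypothetical. re.VERBOSE lets the pattern span several lines with comments, and re.IGNORECASE stands in for the case-insensitive matching that eregi performs. It is the same pattern as above, not a complete e-mail validator.

```python
import re

EMAIL = re.compile(r"""
    ^[_a-z0-9-]+        # first part of the name before the @
    (\.[_a-z0-9-]+)*    # optional further parts, each starting with a dot
    @                   # the literal at sign
    [a-z0-9-]+          # first label of the domain
    (\.[a-z0-9-]+)*     # optional further labels, each starting with a dot
    $
""", re.VERBOSE | re.IGNORECASE)

samples = ["Jane.Doe@example.com", "jane@@example.com", ".jane@example.com"]
accepted = [s for s in samples if EMAIL.match(s)]

print(accepted)  # ['Jane.Doe@example.com']
```

Each \. in the pattern is an escaped dot, so it matches a literal period rather than any character.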
In this article, we have demystified regular expressions and listed the common syntax of the ERE standard. If you want to read the complete description of the rules from The Open Group, see: Regular Expressions. You are welcome to discuss any questions in this area or to share your point of view.