*************************
Chasing CSV
*************************
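In chapter 5 the author only mentions CSV in passing and leaves it incomplete.
In chapter 6 the author comes back to CSV, so this time I can't let it slide.
The file to parse: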
Details,Month,Amount
Mid Bonus,June,"$2,000"
,January,"""zippo"""
Total Bonuses,"","$5,000"
As for the grammar, this is where I actually have to start understanding things. The book says:
Note that we’ve introduced an extra rule called hdr for clarity. Grammatically it’s just a row, but we’ve made its role clearer by separating it out.
Compare this to using just row+ or row row* on the right-side rule file.
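file could have been defined as file : row row+ ; but the author defines it as file : hdr row+ ; and then defines hdr : row ;
I don't see why yet. I probably need to actually try it and look at the result.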
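One way to see what the extra rule changes is to model the two parse-tree shapes by hand. The following is a minimal Python sketch, not ANTLR output: it uses the standard csv module to split fields, and the function names tree_with_hdr and tree_without_hdr are made up for illustration. Both functions accept the same input; the only difference is that the first one wraps the first row in a node named hdr.

```python
import csv
import io

SAMPLE = (
    'Details,Month,Amount\n'
    'Mid Bonus,June,"$2,000"\n'
    ',January,"""zippo"""\n'
    'Total Bonuses,"","$5,000"\n'
)

def _rows(text):
    # One ("row", fields) node per CSV line; csv.reader also unescapes "" to ".
    rows = [("row", fields) for fields in csv.reader(io.StringIO(text))]
    if len(rows) < 2:
        raise ValueError("need a header row plus at least one data row")
    return rows

def tree_with_hdr(text):
    # Hypothetical model of:  file : hdr row+ ;  hdr : row ;
    rows = _rows(text)
    return ("file", ("hdr", rows[0]), *rows[1:])

def tree_without_hdr(text):
    # Hypothetical model of:  file : row row+ ;
    return ("file", *_rows(text))

if __name__ == "__main__":
    print(tree_with_hdr(SAMPLE)[1])     # first child is a named hdr node
    print(tree_without_hdr(SAMPLE)[1])  # first child is just another row
```

If the sketch is a fair model, the set of accepted files is identical either way, which matches the idea that hdr is "grammatically just a row". What changes is that the header becomes addressable by name in the tree. ANTLR generates listener callbacks per rule name, so a separate hdr rule should give code that walks the tree an enterHdr/exitHdr hook instead of making it count rows.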
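'\r'? '\n' can be treated as a fixed idiom: an optional carriage return followed by a newline, i.e. the end of a line.
It shows up all over the place, for example: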
NEWLINE:'\r'? '\n' ; // return newlines to parser (is end-statement signal)
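For comparison, the same idiom can be written as a regular expression. This is a small Python sketch (the helper name split_lines is an assumption for illustration) showing that the pattern accepts both Unix and Windows line endings, but not a bare carriage return.

```python
import re

# Regex analogue of the ANTLR fragment  '\r'? '\n'
NEWLINE = re.compile(r'\r?\n')

def split_lines(text):
    # Split on \n or \r\n and drop the empty piece after a trailing newline.
    parts = NEWLINE.split(text)
    if parts and parts[-1] == '':
        parts.pop()
    return parts

if __name__ == "__main__":
    print(split_lines('a,b\r\nc,d\n'))
```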
The parser rules are file, hdr, row, and field.
The remaining two, TEXT and STRING, are lexer rules.
TEXT tokens are a sequence of characters until we hit the next comma field separator or the end of the line.
TEXT : ~[,\n\r"]+ ;
This lexer rule can pretty much be treated as fixed boilerplate.
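The negated set in TEXT maps directly onto a negated character class in a regular expression. Here is a minimal Python sketch of that correspondence; the helper leading_text is a made-up name, and this is only an analogue of the lexer rule, not how ANTLR implements it.

```python
import re

# Regex analogue of  TEXT : ~[,\n\r"]+ ;
# One or more characters that are not a comma, newline, carriage return, or quote.
TEXT = re.compile(r'[^,\n\r"]+')

def leading_text(s):
    # Return the TEXT token at the start of s, or None if s does not start with one.
    m = TEXT.match(s)
    return m.group(0) if m else None

if __name__ == "__main__":
    print(leading_text('Mid Bonus,June,"$2,000"'))
```

Because of the + quantifier, an empty field such as the first one in ,January,"""zippo""" produces no TEXT token at all.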
Strings are any characters in between double quotes.
STRING : '"' ('""'|~'"')* '"' ; // quote-quote is an escaped quote
The format is a bit fiddly. The author explains it this way: To get a double quote inside a double-quoted string, the CSV format generally uses two double quotes in a row. That’s what the ('""'|~'"')* subrule does in rule STRING.
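The quote-quote escape is easier to follow with the sample values in hand. Below is a minimal Python sketch using a regex analogue of the STRING rule; the helper unquote is an assumed name, and the unescaping step is an addition for illustration, since the lexer rule only recognizes the token.

```python
import re

# Regex analogue of  STRING : '"' ('""'|~'"')* '"' ;
# Opening quote, then any mix of doubled quotes or non-quote characters, closing quote.
STRING = re.compile(r'"(?:""|[^"])*"')

def unquote(token):
    # Check that token is a complete STRING, then strip the outer quotes
    # and turn each doubled quote back into a single quote.
    if not STRING.fullmatch(token):
        raise ValueError("not a well-formed quoted CSV field: %r" % token)
    return token[1:-1].replace('""', '"')

if __name__ == "__main__":
    print(unquote('"$2,000"'))
    print(unquote('"""zippo"""'))
```

In """zippo""" the outermost pair of quotes delimits the string, and each inner pair is one escaped quote, so the field value is zippo wrapped in one pair of double quotes.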
***************************************
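I then modified the grammar file so that it no longer uses hdr.
The result is the same as with hdr, and to me it even looks simpler and easier to read. I don't understand what the author is actually trying to show by using hdr here.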