B (1e@sdZddlZddlZddlZddlZddlmZddlmZddl Z e dej ej BjZejdkZGdddZGd d d ejZGd d d eZd dZddZGdddejZdS)a Try to detect suspicious constructs, resembling markup that has leaked into the final output. Suspicious lines are reported in a comma-separated-file, ``suspicious.csv``, located in the output directory. The file is utf-8 encoded, and each line contains four fields: * document name (normalized) * line number in the source document * problematic text * complete line showing the problematic text in context It is common to find many false positives. To avoid reporting them again and again, they may be added to the ``ignored.csv`` file (located in the configuration directory). The file has the same format as ``suspicious.csv`` with a few differences: - each line defines a rule; if the rule matches, the issue is ignored. - line number may be empty (that is, nothing between the commas: ",,"). In this case, line numbers are ignored (the rule matches anywhere in the file). - the last field does not have to be a complete line; some surrounding text (never more than a line) is enough for context. Rules are processed sequentially. A rule matches when: * document names are the same * problematic texts are the same * line numbers are close to each other (5 lines up or down) * the rule text is completely contained into the source line The simplest way to create the ignored.csv file is by copying undesired entries from suspicious.csv (possibly trimming the last field.) Copyright 2009 Gabriel A. Genellina N)nodes)Builderz ::(?=[^=])| # two :: (but NOT ::=) :[a-zA-Z][a-zA-Z0-9]+| # :foo `| # ` (seldom used by itself) (?tsz7CheckSuspiciousMarkupBuilder.finish..zFound %s/%s unused rules: %sr%css|]}t|VqdS)N)repr)r-r.r r r ysz6CheckSuspiciousMarkupBuilder.finish..)rulesloggerwarninglenr)r Z unused_rulesr r r finishssz#CheckSuspiciousMarkupBuilder.finishcCs ||||s||||dS)N) is_ignored report_issue)r r rrr r r check_issue~sz(CheckSuspiciousMarkupBuilder.check_issuecCsd|j}xX|jD]N}|j|krq|j|kr*q|j|kr6q|jdk rTt|j|dkrTqd|_dSWdS)z/Determine whether this issue should be ignored.NTF)rr2rr rabsr )r r rrrr.r r r r7s     z'CheckSuspiciousMarkupBuilder.is_ignoredc Csd|_||||tr4|jd|j|||fnB|jd|jtd||td| tdfd|j _ dS)NTz[%s:%d] "%s" found in "%-.120s"replace) r)write_log_entrypy3r3r4rencodesysgetdefaultencodingstripZappZ statuscode)r textrrr r r r8sz)CheckSuspiciousMarkupBuilder.report_issuecCstr>t|jd}t|t}||j|||g| nJt|jd}t|t}||j d|| d| dg| dS)NaZabzutf-8) r?rrcsvwriterrZwriterowrrCrr@)r rrrDfrGr r r r>s     z,CheckSuspiciousMarkupBuilder.write_log_entryc Cs|jjdddg|_}ytr,t|d}n t|d}Wntk rLdSXxtt|D]\}}t |dkrt d||d|f|\}}}} |rt |}nd}ts| d }| d }| d } t |||| } || q^W||jd t |jdS) zLoad database of previously ignored issues. A csv file, with exactly the same format as suspicious.csv Fields: document name (normalized), line number, issue, surrounding text zloading ignore rules... r=)ZnonlrrbNzwrong format in %s, line %d: %szutf-8zdone, %d rules loaded)r3infor2r?rIOError enumeraterFreaderr5 ValueErrorintdecoderappendr) r filenamer2rHirowrrrrDr.r r r rs0        z'CheckSuspiciousMarkupBuilder.load_rules)N)rrrrnamesphinxutilZloggingZ getLoggerr3r"r$r'r(r,r6r9r7r8r>rr r r r rTs  rcCs&d}x|dkr |r |j}|j}qW|S)z*Obtain line number information for a node.N)parentr )noderr r r get_linenos  r\cCs:|dd|d}|d|}|dkr.t|}|||S)a text may be a multiline string; extract only the line containing the given character index. >>> extract_line("abc defgh i", 6) >>> 'defgh' >>> for i in (0, 2, 3, 4, 10): ... print extract_line("abc defgh i", i) abc abc abc defgh defgh i rrr=)rfindfindr5)rDindexpqr r r extract_lines  rbc@s4eZdZdZddZddZeZddZdd Zd S) r*rcCstj||||_dS)N)rGenericNodeVisitorrbuilder)r Zdocumentrdr r r rszSuspiciousVisitor.__init__cCst|tjtjfr|}tt|p&d|j|_}t}xPt |D]D}| }t || }||f|krD|j ||||||fqDWdS)Nr) isinstancerZTextZimageZastextmaxr\ lastlinenoset detect_allgrouprbstartrdr9add)r r[rDrseenmatchrr r r r default_visits zSuspiciousVisitor.default_visitcCs d|_dS)Nr)rg)r r[r r r visit_documentsz SuspiciousVisitor.visit_documentcCs tjdS)N)rZSkipNode)r r[r r r visit_commentszSuspiciousVisitor.visit_commentN) rrrrgrroZ unknown_visitrprqr r r r r*s  r*)rrrerFrAZdocutilsrZsphinx.buildersrZ sphinx.utilrXcompileUNICODEVERBOSEfinditerri version_infor?rZexcelrrr\rbrcr*r r r r *s"