[Haskell-cafe] Problem with HXT `when`

Vlatko Basic vlatko.basic at gmail.com
Sat Sep 21 11:13:48 CEST 2013


Hello Cafe,

I have this HTML structure:
...
<table>
    ...
    <tr>
        <th>Caption</th>
        <td>
            <a href="...">Want this</a>
            <a href="...">And this</a>
        </td>
    </tr>
    <tr>
        <th>Another caption</th>
        <td>
            ....
    <tr>
        <th>Yet another caption</th>
        ...
</table>
...

I'd like to extract the text of the <a> elements from the row whose header is
"Caption", and have come up with this:

runX $ doc
     >>> (deep (hasName "tr")                      -- filter only TRs
                >>> withTraceLevel 5 traceTree     -- shows correct TR
                `when`
              deep (
                 hasName "th" >>>                  -- filter THs with specified text
                 getChildren >>> hasText (=="Caption")
              ) -- inner deep
              >>> getChildren >>> hasName "td" -- shouldn't here be only one TR?
              >>> getChildren
           )
          >>> getName &&& (getChildren >>> getText)  -- list has TDs from all three TRs

I also tried `guards`, but got the same result.


I know there are other packages that might solve this in another way, but I'd 
like to understand what is going on here.

br,

vlatko





