Let's Read In a Picture of a Park

Now that we can create SVG pictures in Haskell, the next thing we'll want to do is read pictures made by other programs. Below is a picture of Stanton Park in Washington DC (grab the file here) that I drew in Inkscape.

Here's the code that will load that picture into Haskell (again, just stick it at the bottom of the previous code):

  
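  -- Pull the first "x,y" coordinate pair out of a chunk of text
  let readPoint :: String -> Point
      readPoint s | Just [x,y] <- matchRegex (mkRegex "([0-9.]+),([0-9.]+)") s = (read x,read y)
                  | otherwise = error ("readPoint: no coordinates in " ++ show s)

  -- The points of one path are separated by " L " (line-to) commands
  let readPolygon :: String -> Polygon
      readPolygon = (map readPoint).(splitRegex $ mkRegex " L ")

  -- Each "<path" starts a new polygon; tail drops the text before the first one
  let readPolygons :: String -> [Polygon]
      readPolygons = (map readPolygon).tail.(splitRegex $ mkRegex "<path")

  park_data <- readFile "park.svg"

  let park = readPolygons park_data

  writeFile "tut1.svg" $ writePolygons (green park)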

Here's what tut1.svg looks like. Notice that only the outlines of the polygons are colored once the park data has made a pass through our program. There's no need in this simple tutorial to track the fill colors separately from the outlines, so we'll just leave those colorless:

This code is structured pretty much the same way as the code that wrote the SVG. In this case we're using regular expressions to split the data into parts, one for each polygon. The SVG format is actually very complex, so this code takes some liberties with the format and may fail on some SVG files. For the purposes of loading some SVG maps into our program, though, it's great!
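To see what the splitting actually does, here is a small standalone sketch of the same three functions run on a made-up string instead of park.svg. The sample text and the Point and Polygon aliases are assumptions for illustration (the real aliases come from the earlier part of the tutorial), and Text.Regex is assumed to come from the regex-compat package:

```haskell
import Text.Regex (matchRegex, mkRegex, splitRegex)

-- Assumed to match the aliases defined earlier in the tutorial
type Point   = (Float, Float)
type Polygon = [Point]

-- Hypothetical miniature SVG: two paths, points joined by " L "
sample :: String
sample = "<svg><path d=\"M 10,20 L 30,40 L 50,60 z\"/>"
      ++ "<path d=\"M 1.5,2.5 L 3,4 z\"/></svg>"

-- First "x,y" pair found in a chunk of text
readPoint :: String -> Point
readPoint s
  | Just [x, y] <- matchRegex (mkRegex "([0-9.]+),([0-9.]+)") s = (read x, read y)
  | otherwise = error ("readPoint: no coordinates in " ++ show s)

-- One chunk per " L " separator, one point per chunk
readPolygon :: String -> Polygon
readPolygon = map readPoint . splitRegex (mkRegex " L ")

-- One chunk per "<path"; tail drops the text before the first path
readPolygons :: String -> [Polygon]
readPolygons = map readPolygon . tail . splitRegex (mkRegex "<path")
```

Loading this in GHCi and evaluating readPolygons sample should give one list of points per path. Note that readPoint only ever looks at the first coordinate pair in its chunk, which is one of the liberties mentioned above.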

What's good about this code?

This code illustrates another situation where Haskell's laziness really makes things easy for us: all the regular expression functions just take regular ol' text strings. If you've ever used regular expression libraries in other languages, you may remember that they often work on streams, ports, or file handles. Haskell just uses strings. Some of you may protest, "Hey! What if you're reading in a 2GB file? Are you just going to read that into a string and then parse it? That would be suicide!"

In most languages, this would indeed be suicide, but not in Haskell: because it's a lazy language (and readFile reads its file lazily), it won't actually read in any data from the file until something needs it. So, theoretically, this code would be just as efficient as matching a regular expression against a stream! Additionally, if you aren't using the park_data text string for anything else, the garbage collector will probably reclaim the "front end" of the file as we go. So, in theory, we could search through a 2GB file in Haskell in this incredibly simple manner and still maintain a memory footprint that is comparable to that of a stream-based regular expressions library!
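The lazy behavior is easy to try out without a 2GB file. The sketch below stands in for the file with an endless string (the names endlessInput, firstChunk and foundFar are made up for this illustration, and no regular expression library is involved); because strings are lazy lists, only the part that a search actually demands ever gets built:

```haskell
import Data.List (isInfixOf)

-- A conceptually endless "file" of path elements; nothing is built until demanded
endlessInput :: String
endlessInput = concatMap pathFor [1 :: Int ..]
  where pathFor n = "<path d=\"M " ++ show n ++ ",0\"/>"

-- Only the first 17 characters of the endless string are ever produced here
firstChunk :: String
firstChunk = take 17 endlessInput

-- The search stops at the first hit, so it finishes despite the endless input
foundFar :: Bool
foundFar = "M 1000,0" `isInfixOf` endlessInput
```

A String returned by readFile behaves in a similar way: the contents are pulled from disk as the consumer walks down the list, which is what makes the simple string-based approach plausible for large files.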