adamcrussell

Perl Weekly Challenge 017

Parsing with Parse::Yapp and using the Bhagavad Gita API.

Back in Week 10 I decided to handle Roman numerals with a byacc based parser.
This week I took a very similar but slightly less cumbersome approach, using Parse::Yapp to parse and print the components of a URL.

Also, while I do not usually take the extra time to do the optional API challenges, this week I felt like maybe I should give it a try.

As with the previous weekly challenges, the problem statements are short and are included in the first comment of the code. The code blocks shown link to GitHub Gists. (I prefer to use the gists rather than linking to the code in the main repo since it seems that the gist URLs are more permanent. The main repo is a fork which might get deleted or substantially re-organized at some point.)

Part 1

Sample Run

$ perl perl5/ch-1.pl
4

Discussion

This part of the week's challenge was a nice reminder of recursion in Perl, something which I have spent a lot of time on in the past but haven't done much with recently. I also started to have the main part of the script in a named block, as is done by fellow weekly challenge participant Athanasius. I am in the habit of marking this section of code with a big multiline comment but after looking at some of his code recently I really like this approach!
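The problem statement is not reproduced in the text of this post, so the following is only a minimal sketch of the two ideas mentioned here: a recursive function and a named block for the main part of the script. It assumes the task was the Ackermann function, which is consistent with the sample run since A(1, 2) = 4. The names ackermann and MAIN are illustrative choices and not necessarily those used in ch-1.pl.

```perl
use strict;
use warnings;
no warnings 'recursion';    # deep recursion is expected for this function

# Ackermann function, written directly from its recursive definition.
sub ackermann {
    my ($m, $n) = @_;
    return $n + 1 if $m == 0;
    return ackermann($m - 1, 1) if $n == 0;
    return ackermann($m - 1, ackermann($m, $n - 1));
}

# The main part of the script lives in a labeled block instead of
# underneath a large banner comment.
MAIN: {
    print ackermann(1, 2), "\n";
}
```

The label is purely for the reader; Perl treats MAIN: { ... } as an ordinary bare block, which also keeps any lexical variables declared inside it from leaking into the subroutines.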

Part 2

Sample Run 

$ perl perl5/ch-2.pl
SCHEME: jdbc
USER: user
PASSWORD: password
HOST: localhost
PORT: 3306
PATH: /pwc
QUERY: profile=true
FRAGMENT: h1

Discussion

The code above is, of course, really just calling the functions defined in UrlParser.pm, which is generated by yapp based on UrlGrammar.yp. The steps involved are:

  1. Define a grammar in UrlGrammar.yp. Unless you really prefer writing your lexing functions elsewhere include them in this file as well.
  2. Execute yapp -m UrlParser UrlGrammar.yp. This generates the parser and saves it to a file named UrlParser.pm. Specifying an output filename with the -m option will allow you to give the generated parser a name which may make more sense than the default which in this case would be UrlGrammar.pm. UrlGrammar.yp is shown below.
  3. Write a wrapper to the parser, as I have done in ch-2.pl. Mine here is very small and straightforward since we're just parsing one string. Larger projects will need more code to, say, stream larger bodies of text.
  4. Optional. Execute yapp -v UrlGrammar.yp. This generates UrlGrammar.output which includes the rules and states used by the parser. This also shows any potential parser conflicts. Reviewing this file may be helpful.
  5. Optional. Use UrlGrammar.output to visualize the grammar with GraphViz::Parse::Yapp.

The grammar above was informed both by the problem statement in the weekly challenge and also a loosely similar URL grammar that I found.


Script to generate the grammar visualization.


Visualization of the grammar.

If you review the code (and even just the diagram above!) a little closely you can see that this approach was somewhat overkill for this problem. The URL components are essentially all determined in lexer(). That function can be described as gnawing away at the URL. That is, it works from left to right, matching what it can, removing what has been matched, and then repeating the process until no input remains. Just by looking at the diagram of the grammar you can see that each of the components is considered a token expected to be passed in from lexer().
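To make the "gnawing" idea concrete, here is a minimal standalone sketch of that style of lexing in plain Perl, without Parse::Yapp. It is not the lexer() from UrlGrammar.yp: the rule table, the patterns, and the name tokenize_url are all assumptions made for illustration, and the input URL is reconstructed from the sample run rather than taken from the original script.

```perl
use strict;
use warnings;

# Each rule pairs an anchored pattern with the token names for its captures.
# Rules are tried in order; the first one matching at the front wins.
my @RULES = (
    [ qr{^([A-Za-z][A-Za-z0-9+.-]*)://}, ['SCHEME'] ],
    [ qr{^([^:\@/]+):([^\@/]*)\@},       ['USER', 'PASSWORD'] ],
    [ qr{^([^:/?#]+)},                   ['HOST'] ],
    [ qr{^:(\d+)},                       ['PORT'] ],
    [ qr{^(/[^?#]*)},                    ['PATH'] ],
    [ qr{^\?([^#]*)},                    ['QUERY'] ],
    [ qr{^#(.*)},                        ['FRAGMENT'] ],
);

# Work left to right: match what we can, remove what was matched, and
# repeat until no input remains. Returns a list of [NAME, value] pairs.
sub tokenize_url {
    my ($input) = @_;
    my @tokens;
    TOKEN: while (length $input) {
        for my $rule (@RULES) {
            my ($pattern, $names) = @$rule;
            if (my @captures = $input =~ $pattern) {
                # $+[0] is the end offset of the whole match.
                $input = substr($input, $+[0]);
                push @tokens,
                    map { [ $names->[$_], $captures[$_] ] } 0 .. $#captures;
                next TOKEN;
            }
        }
        die "Cannot tokenize remaining input: $input\n";
    }
    return @tokens;
}

my $url = 'jdbc://user:password@localhost:3306/pwc?profile=true#h1';
printf "%s: %s\n", @$_ for tokenize_url($url);
```

In a yapp-based version the same matching would typically hand one token at a time back to the parser rather than collecting them in a list, but the consume-and-repeat loop is the same.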

The defined grammar itself is really just providing a framework to keep everything organized and conveniently print the components as they are recognized. 

An alternative implementation might just have much of the same code as in the lexer() function on its own or attempt to use a single monolithic regex. I would argue that the approach I took is straightforward, very maintainable, and might be more easily extended if we wanted to go beyond just parsing a single URL at a time.

For anyone interested in exploring parsers in depth, from a utilitarian perspective, I highly suggest Pro Perl Parsing. That book contains a great overview of parsing in general and also explores several specific tools, including Parse::Yapp.

Part 3

Sample Run

$ perl perl5/ch-3.pl 18 61
O Arjuna, the Lord resides in the region of the heart of all creatures, revolving through Maya all the creatures (as though) mounted on a machine!

Discussion

I haven't been taking the extra time for the optional API challenges; however, this one changed my mind. As a practitioner of bhakti-yoga in the Gaudiya Vaishnava tradition I took this challenge as a sign!

The Bhagavad Gita API requires you to register yourself and then, once you're logged in, to enroll individual applications which will use the API. I created a single application which I called PerlWeeklyChallenge17. After doing this I was given a client id and client secret. These are then used to obtain an authorization token which must accompany each API call. Authorization tokens are good for 300 seconds. In my code above I am not tracking the token lifetimes, however.

I really enjoyed this part of the challenge. There is a lot of room to really have fun with this. Tracking token lifetimes, as mentioned above, is an obvious first enhancement. Another might be to add flexibility in displaying the original texts, not just the translations which I chose to do for the sake of simplicity. 
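As a rough idea of what tracking token lifetimes could look like, here is a small sketch that caches a token and requests a new one shortly before the 300 seconds are up. Nothing here talks to the real API: make_token_provider, the fetch callback, the margin, and the injectable clock are all hypothetical names introduced for this example, and the fetch callback stands in for whatever code actually requests a token with the client id and secret.

```perl
use strict;
use warnings;

# Returns a closure that hands out a cached token and refreshes it when it
# is about to expire. The fetch callback is a stand-in (an assumption) for
# the real token request; clock defaults to time() and is replaceable so
# the behavior can be exercised without waiting.
sub make_token_provider {
    my (%args) = @_;
    my $fetch    = $args{fetch};
    my $lifetime = $args{lifetime} // 300;    # seconds a token is good for
    my $margin   = $args{margin}   // 10;     # refresh a little early
    my $clock    = $args{clock}    // sub { time };
    my ($token, $expires_at) = (undef, 0);
    return sub {
        my $now = $clock->();
        if (!defined $token || $now >= $expires_at - $margin) {
            $token      = $fetch->();
            $expires_at = $now + $lifetime;
        }
        return $token;
    };
}

# Demonstration with a fake fetcher and a fake clock.
my $requests  = 0;
my $fake_now  = 1000;
my $get_token = make_token_provider(
    fetch => sub { return 'token-' . ++$requests },
    clock => sub { return $fake_now },
);
print $get_token->(), "\n";    # first call fetches a token
$fake_now += 100;
print $get_token->(), "\n";    # 100 seconds later: cached token is reused
$fake_now += 200;
print $get_token->(), "\n";    # 300 seconds after the fetch: a new token
```

Each API call would then ask the closure for a token instead of holding on to one, so a long-running script never sends an expired token.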

Finally, I am unsure of the sources used by the Bhagavad Gita API. I would suggest that for any serious reading one use Bhagavad-gītā As It Is.
