Whoa! Post 100 for this little blog. Although it’s been a little weak on content at times, I think there have been enough good posts to outweigh the weak ones. Besides, this site is more for me than for anyone else.
As this is post 100, I’m required by a dubious interpretation of a little-known Norwegian law to list my favorite posts so far. So here they are, loosely categorized.
Mildly Useful
http://www.velvetcache.org/2006/09/28/usb-apps/
http://www.velvetcache.org/2006/09/17/firefox-plugins/
http://www.velvetcache.org/2006/10/02/simple-php-caching/
http://www.velvetcache.org/2007/01/19/cleaning-up-e-books/
http://www.velvetcache.org/2007/01/08/renaming-your-the-folders/
http://www.velvetcache.org/2007/01/08/pop-can-bookmarkers/
Wordy And Thoughtful
http://www.velvetcache.org/2006/09/15/i-cant-be-your-john-cusack/
http://www.velvetcache.org/2006/11/01/rediscovery/
http://www.velvetcache.org/2006/09/19/complexity-vs-redundancy/
http://www.velvetcache.org/2006/09/20/linux-on-the-desktop/
http://www.velvetcache.org/2006/09/07/facebook-apis/
http://www.velvetcache.org/2006/10/03/moniker-junkie/
Now to the meat of this post!
When I was first hired at UNO I was given the transfer articulation site as a project. What they basically do is keep track, from year to year, of what each class at a number of schools is equivalent to here at UNO. I wrote it pretty quickly, and they’ve been slowly adding data by hand for a few months now.
It’s a lot of data to enter too. So far they only have one year of one school done. The old system was a series of static HTML pages, so they didn’t think they could load it into the new system. I didn’t agree fully, because although the pages were poorly written and differed from year to year, they had a standard table layout on each one. I got to work on the idea of extracting the old data and loading it into the new database.
The first thing to do was create a syntactically correct file. Here’s a sample of part of one of the original files:
...
<TD>
<CENTER>
Max.<br/>
Transfer Hours<br/>
Allowed
</CENTER>
<TD>
<P>
<CENTER>
Comments
</CENTER>
<TR>
<TD height="33">
<P>ARCH1300
<TD>
<P>Architectural Desktop I
...
Nasty, all-caps, and they didn’t even close the tags. Ugly, ugly.
Luckily I knew of a secret weapon, HTML Tidy!
When I ran a file through it with the appropriate flags, I got this lovely version of the code:
...
<td>
<div class="c2">
Comments
</div>
</td>
</tr>
<tr>
<td height="33">
<p>ARCH1300</p>
</td>
<td>
<p>Architectural Desktop I</p>
</td>
...
Okay, so running the tidy command on every file one at a time would be crazy, so I wrote up a short batch file to hit every single .html file with the tidy love. Please excuse the nasty one-lined-ness of it.
FOR %%f IN ("*.html") DO tidy --char-encoding utf8 --clean yes --doctype strict --escape-cdata yes --indent auto --indent-attributes no --join-classes yes --output-xhtml yes --show-errors 99 --tidy-mark no --wrap 0 -m "%%f"
Okay, so now that we’ve got that beautified file, we need to parse it. To make life easier I stripped all the other tags out except for the table tags with PHP’s strip_tags(). I then regexed out anything that wasn’t inside the <table> tags and created my SimpleXML object with what was left over.
$filename = $_REQUEST['dir'].'/'.$_REQUEST['file'];

// Read the whole tidied file into memory.
$handle = fopen($filename, "r");
$contents = fread($handle, filesize($filename));
fclose($handle);

// Keep only the table markup.
$contents = strip_tags($contents, '<table><tr><td>');

// Grab everything from the opening <table> through the closing </table>.
if (preg_match('/<table.*<\/table>/si', $contents, $matches)) {
    print 'Will attempt to load. ';
    $contents = $matches[0];

    // SimpleXMLElement throws an Exception on unparseable XML rather than returning false.
    try {
        $xml = new SimpleXMLElement($contents);
    } catch (Exception $e) {
        print 'Error! XML Didn\'t read right.';
        exit();
    }

    print 'Load complete. Checking for expected structure. ';
    if (trim($xml->tr[0]->td[0]) != 'Course') {
        print 'Error! Structure not as expected. Dumping.<br/>';
        var_dump($xml);
        exit();
    }
}
A few loops and a whole lot of boring processing later and it’s all in the DB. But that’s the gist of the system and it’s frickin tight.
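For the curious, here is a rough sketch of what the "few loops" step could look like once only the table markup is left. It is written in JavaScript rather than PHP, and it is not the code that actually ran: `tableRows`, `rowsToRecords`, and the `Title` column name are made up for illustration. The only things taken from the real files are the `Course` header and the ARCH1300 row.

```javascript
// Hypothetical helper: turn cleaned-up table markup into an array of rows,
// where each row is an array of trimmed cell strings.
function tableRows(html) {
  const rowPattern = /<tr[^>]*>([\s\S]*?)<\/tr>/gi;
  const cellPattern = /<td[^>]*>([\s\S]*?)<\/td>/gi;
  const rows = [];
  for (const rowMatch of html.matchAll(rowPattern)) {
    const cells = [];
    for (const cellMatch of rowMatch[1].matchAll(cellPattern)) {
      // Collapse line breaks and runs of spaces left over from the old markup.
      cells.push(cellMatch[1].replace(/\s+/g, ' ').trim());
    }
    rows.push(cells);
  }
  return rows;
}

// Hypothetical helper: treat the first row as column names and turn every
// following row into an object keyed by those names.
function rowsToRecords(rows) {
  const header = rows[0];
  return rows.slice(1).map(function (cells) {
    const record = {};
    header.forEach(function (name, i) {
      record[name] = cells[i] !== undefined ? cells[i] : '';
    });
    return record;
  });
}

// Sample markup; the "Title" header is an assumption, not from the real pages.
const sample =
  '<table><tr><td>Course</td><td>Title</td></tr>' +
  '<tr><td> ARCH1300 </td><td>Architectural\n Desktop I</td></tr></table>';
console.log(rowsToRecords(tableRows(sample)));
```

Each record would then be ready for an insert into the database, after whatever per-school cleanup the data needs.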
Posted January 23rd, 2007 - Permalink
I honestly don’t know what else to call this, so “Javascript Attention Grabber” will have to do. We have a site at UNO that has an alphabetic list and you can click on a letter at the top of the page to jump to it. However, the pages aren’t always long enough for the jump to happen, yet are still cluttered enough to get lost without it. Thus I whipped up this simple little guy to highlight the div for a short time.
I hardcoded the return background color because tempColor = targetObj.style.backgroundColor; wasn’t picking up the old color. I’ll have to figure that one out later. I’d also like to add a fader, but that was too much work at the time. Anyway, here’s the first version.
<script type="text/javascript">
// Highlight the target element, then restore its background after one second.
function hiliter(targetID, color) {
  var targetObj = document.getElementById(targetID);
  var tempColor = "#F6F6F6"; // hardcoded restore color
  targetObj.style.backgroundColor = color;
  setTimeout(function () { unhiliter(targetID, tempColor); }, 1000);
}

function unhiliter(targetID, color) {
  document.getElementById(targetID).style.backgroundColor = color;
}
</script>
<a href="#anchorName" onclick="hiliter('targetID','#EAD0D1');">Link</a>
P.S. Yet again I’m faced with the ugly usability problems of this website. I need to take the time to fix it up. Let me go make a note of that on my Nokia 770. :)
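A possible follow-up on the two loose ends above (the old color not being readable, and the missing fader). An element's `style.backgroundColor` only reflects inline styles, so it comes back empty when the color is set in a stylesheet; `getComputedStyle` returns the color actually rendered. The sketch below assumes the normal background comes from a stylesheet and that the computed value looks like `rgb(r, g, b)`. `parseColor`, `blend`, and `fadeHighlight` are illustrative names, not part of the original script.

```javascript
// Parse "#RRGGBB" or "rgb(r, g, b)" into [r, g, b]. Returns null for formats
// this sketch does not handle, including fully transparent backgrounds.
function parseColor(text) {
  const hex = /^#([0-9a-f]{6})$/i.exec(text);
  if (hex) {
    const n = parseInt(hex[1], 16);
    return [(n >> 16) & 255, (n >> 8) & 255, n & 255];
  }
  const rgb = /^rgba?\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*(?:,\s*([\d.]+)\s*)?\)$/i.exec(text);
  if (!rgb) return null;
  if (rgb[4] !== undefined && Number(rgb[4]) === 0) return null;
  return [Number(rgb[1]), Number(rgb[2]), Number(rgb[3])];
}

// Linear mix of two [r, g, b] colors: t = 0 gives `from`, t = 1 gives `to`.
function blend(from, to, t) {
  const mixed = from.map(function (c, i) {
    return Math.round(c + (to[i] - c) * t);
  });
  return 'rgb(' + mixed.join(', ') + ')';
}

// Highlight `element`, fade back toward its rendered color over `steps` ticks,
// then clear the inline style so the stylesheet color applies again.
// `view` is the window object, passed in so the function is easy to test.
function fadeHighlight(element, view, highlight, steps, intervalMs) {
  const from = parseColor(highlight);
  // Read the rendered color before the inline highlight overrides it.
  const to = parseColor(view.getComputedStyle(element).backgroundColor);
  element.style.backgroundColor = highlight;
  if (!from || !to) {
    // Unknown color format: skip the fade and just remove the highlight later.
    view.setTimeout(function () {
      element.style.backgroundColor = '';
    }, steps * intervalMs);
    return;
  }
  let step = 0;
  const timer = view.setInterval(function () {
    step += 1;
    if (step >= steps) {
      view.clearInterval(timer);
      element.style.backgroundColor = '';
    } else {
      element.style.backgroundColor = blend(from, to, step / steps);
    }
  }, intervalMs);
}
```

In a page this could be wired up along the lines of `onclick="fadeHighlight(document.getElementById('targetID'), window, '#EAD0D1', 10, 50);"`. Clearing the inline style at the end is a design choice that avoids hardcoding the restore color, but it would also wipe out a background that had been set inline in the first place.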
Posted November 3rd, 2006 - Permalink