You are viewing [info]black_chair's journal

Portland Beavers 2009 Schedule

  • Feb. 11th, 2009 at 9:06 AM
The Portland Beavers don't seem to offer their schedule in any electronic format other than their web page.

I like to have these things in my calendar, so that I know what I'm missing when I'm away. So I scraped their "printable" page, ran it through a little script, and produced a standard iCalendar file. (Yes, it really is a standard, RFC 2445, and it has nothing to do with Apple's iCal.)

You can subscribe to it if you want.


To start with, I cut-n-pasted the text from the "printable" schedule page and saved it in a file.

Then I ran it through this little script:
#!/usr/bin/perl -n

BEGIN {
print <<'EOF';
BEGIN:VCALENDAR
METHOD:PUBLISH
X-WR-TIMEZONE:US/Pacific
PRODID:-//Apple Inc.//iCal 3.0//EN
CALSCALE:GREGORIAN
X-WR-CALNAME:Portland Beavers
VERSION:2.0
EOF
}

chomp;
# Schedule entries start with a date; skip everything else.
next unless /^\d/;
my ($mon, $day, $year, $time, $off, $away, $what, $outcome, $score) = ($_ =~ m#(\d+)/(\d+)/(\d+)\s+(?:(\d+:\d+\s+(A|P)M|TB[AD]))\s+(@?)(.*?)(?:\s+([WL])\s+(\d+-\d+))?\s*$#);
# Check the match before using any of the captures.
if (!defined($mon)) {
  print STDERR "Unknown: '$_'\n";
  next;
}
my ($hr, $min) = ($time =~ /(\d+):(\d+)/);
if ($time =~ /^TB/) {
  # No time announced yet: use noon as a placeholder and say so.
  $hr = 12;
  $min = 0;
  $what .= " [Time $time]";
}
if ($away) {
  $what = "@ $what";
} else {
  $what = "vs $what";
}
if ($outcome) {
  $what .= " ($outcome $score)";
}
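# Convert to 24-hour time (12 AM is hour 0; 12 PM stays 12).
if (defined $off) {
  $hr += 12 if $off eq 'P' && $hr != 12;
  $hr = 0 if $off eq 'A' && $hr == 12;
}
my $start = sprintf("%04d%02d%02dT%02d%02d00", $year, $mon, $day, $hr, $min);
# Assume a game lasts 2.5 hours, carrying minutes over into hours.
$min += 30;
$hr += 2;
if ($min >= 60) {
  $min -= 60;
  $hr += 1;
}
if ($hr >= 24) {
  # This shouldn't happen (and it doesn't handle month ends)
  $hr -= 24;
  $day += 1;
}
my $end = sprintf("%04d%02d%02dT%02d%02d00", $year, $mon, $day, $hr, $min);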

print <<"EOF";
BEGIN:VEVENT
DESCRIPTION:$what
URL;VALUE=URI:www.portlandbeavers.com
DTSTART;TZID=US/Pacific:$start
SUMMARY:$what
DTEND;TZID=US/Pacific:$end
END:VEVENT
EOF

END {
print "END:VCALENDAR\n";
}


Like so: cat foo | perl sched.pl > beavers.ics
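
The script assumes every game lasts two and a half hours and bumps the day by hand if the end time runs past midnight. As a rough sketch of another way to do that arithmetic, the core Time::Local and POSIX modules can handle the carrying, including across month ends. The start_and_end helper, the $GAME_SECONDS constant, and the sample date are all made up for illustration; this is not what the script above does.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);
use POSIX qw(strftime);

# Assumed game length: the same 2 hours 30 minutes the script uses.
my $GAME_SECONDS = 2 * 3600 + 30 * 60;

# Return (start, end) stamps in iCalendar's local-time form (YYYYMMDDTHHMMSS).
# The wall-clock time is treated as if it were UTC purely to do the arithmetic,
# so no time zone conversion ever happens.
sub start_and_end {
    my ($year, $mon, $day, $hr, $min) = @_;
    my $start = timegm(0, $min, $hr, $day, $mon - 1, $year);
    my $fmt   = "%Y%m%dT%H%M%S";
    return (strftime($fmt, gmtime($start)),
            strftime($fmt, gmtime($start + $GAME_SECONDS)));
}

# A made-up late start on the last day of a month, to show the rollover.
my ($start, $end) = start_and_end(2009, 4, 30, 22, 35);
print "$start -> $end\n";   # 20090430T223500 -> 20090501T010500
```

The hour passed in is expected to be in 24-hour form already, so the AM/PM conversion would still have to happen first.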

Then I imported the calendar into iCal and published it via the calendar DAV service I had set up previously.

I did do a little manual fix-up; basically just removing the leading "vs" from the All-Star Game and related activities.

If I get really industrious, I'll repeat the process occasionally to keep the record up-to-date. I might also do the promo schedule. But even if I don't do any of that, at least now we'll know date, time, and opponent.

Comments

( 12 comments — Leave a comment )
[info]watanukisuki wrote:
Feb. 11th, 2009 05:44 pm (UTC)
Yay! Thank you! :D
[info]teh_kei wrote:
Feb. 11th, 2009 08:12 pm (UTC)
I always enjoy reading your scripts :P You're like an evil genius. Although this one looks a bit simpler than the others,
except for this crazy statement:

my ($mon, $day, $year, $time, $off, $away, $what, $outcome, $score) = ($_ =~ m#(\d+)/(\d+)/(\d+)\s+(?:(\d+:\d+\s+(A|P)M|TB[AD]))\s+(@?)(.*?)(?:\s+([WL])\s+(\d+-\d+))?\s*$#);
my ($hr, $min) = ($time =~ /(\d+):(\d+)/);
[info]black_chair wrote:
Feb. 11th, 2009 08:22 pm (UTC)
Yeah, this was definitely a quick throw-together, with a few revisions to handle the obvious errors that cropped up.

And when I'm writing regexps for myself, I don't usually spend any effort to make them especially readable. :) But that's the heart of the whole thing -- the whole effort depends on the data that that one line extracts.

If you (or anyone else) want, I can break it down.
[info]teh_kei wrote:
Feb. 11th, 2009 08:29 pm (UTC)
I would love for you to break that down for me. I really need to brush up on my regular expression stuff. I'm really bad at it.
[info]black_chair wrote:
Feb. 12th, 2009 03:08 am (UTC)
Okay. For review, here's the whole thing:
m#(\d+)/(\d+)/(\d+)\s+(?:(\d+:\d+\s+(A|P)M|TB[AD]))\s+(@?)(.*?)(?:\s+([WL])\s+(\d+-\d+))?\s*$#

And a representative line of input:
04/03/2008       07:05 PM       Fresno Grizzlies         W 4-8


In Perl, when you use forward slashes to delimit the regexp, the 'm' (which is an operator) is implied. But because I was going to have forward slashes as part of the regexp, I want to use a different delimiter, and thus 'm#' which is "match, delimited by hash symbols" (octothorpes, if you like).
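
A tiny sketch of that equivalence, using a made-up $date value rather than a real schedule line:

```perl
use strict;
use warnings;

# Hypothetical sample value, just for illustration.
my $date = "04/03/2008";

# With the default delimiter, every literal slash has to be escaped.
my @slashes = ($date =~ /(\d+)\/(\d+)\/(\d+)/);

# With m#...#, the slashes can be written as-is.
my @hashes = ($date =~ m#(\d+)/(\d+)/(\d+)#);

print "@slashes\n";   # 04 03 2008
print "@hashes\n";    # 04 03 2008
```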

If I'd used the "x" (eXtended legibility) option, I could've written it like this:
m{
  (\d+)         # Grab the month and put it in $1
  /             # Match the date delimiter
  (\d+)         # Grab the day and put it in $2
  /             # Match the date delimiter
  (\d+)         # Grab the year and put it in $3
 \s+            # There's some whitespace between the fields; eat it
  (?:           # Now it's time to match the time, which will either be "HH:MM AM",
                 # "HH:MM PM", "TBD" or "TBA".  This starts a group whose contents won't
                 # be saved in a backreference.  Hey, what do you know?  This is totally
                 # useless here.  I should've left it out.  Anyway...
    (            # Start a group that will save whatever's matched in $4
      \d+:\d+\s+(A|P)M   # Match the HH:MM [AP]M, and store the A or P in $5
      |           # OR, if it's not HH:MM...
      TB[AD]      # Match TB followed by A or D.  They use TBD, but I wanted to be prepared
                  # for TBA as well.
    )            # This is the end of the time group
   )            # And this is the end of the useless group.
  \s+           # Once again, whitespace between the fields
  (@?)          # If there's an '@' before the team name, save it in $6.  For home games
                # there's no '@', so $6 will be empty.  The '?' means to match 0 or 1 times.
  (.*?)         # Match everything that comes next (".*") but don't be greedy ("?").  IOW,
                # don't consume stuff that things after this bit might match.  All this
                # gets saved in $7.
  (?:           # Start another non-save group.  This is just to accommodate the '?' a few
                 # lines down, which basically means "it's okay if none of this stuff is here"
    \s+          # Match whitespace between the team name and the game record
    ([WL])       # Win or loss?  Save it in $8
    \s+          # Eat whitespace between the outcome and the score
    (\d+-\d+)    # Match NUM-NUM (the score) and put it in $9.
  )             # End of the game record group
  ?             # And it's okay if it doesn't match (as it won't for games that haven't been
                # played yet).
  \s*           # If there's any whitespace left after everything else matches, this will eat it.
  $             # This anchors to the end of the line
}x             # And this is the end of the regexp with the "x" modifier.


I did the matching and the variable assignment all in one step, but it could've easily been two, as in:
m#(\d+)/(\d+)/(\d+)\s+(?:(\d+:\d+\s+(A|P)M|TB[AD]))\s+(@?)(.*?)(?:\s+([WL])\s+(\d+-\d+))?\s*$#;
my ($mon, $day, $year, $time, $off, $away, $what, $outcome, $score) = ($1, $2, $3, $4, $5, $6, $7, $8, $9);
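
One caveat with the two-step form: $1 through $9 are only reset by a successful match, so after a failed match they still hold whatever the last successful one left behind. A minimal sketch of guarding against that, run on the representative line from above plus a made-up line that won't match (the @lines and @report names are just for illustration):

```perl
use strict;
use warnings;

# The representative input line, plus a hypothetical non-schedule line.
my @lines = (
    "04/03/2008       07:05 PM       Fresno Grizzlies         W 4-8",
    "Home games in bold",
);

my @report;
for (@lines) {
    # Only read $1..$9 inside the branch where the match succeeded.
    if (m#(\d+)/(\d+)/(\d+)\s+(?:(\d+:\d+\s+(A|P)M|TB[AD]))\s+(@?)(.*?)(?:\s+([WL])\s+(\d+-\d+))?\s*$#) {
        my ($mon, $day, $year, $time, $off, $away, $what, $outcome, $score)
            = ($1, $2, $3, $4, $5, $6, $7, $8, $9);
        push @report, "$mon/$day/$year | $time | $what | $outcome $score";
    } else {
        push @report, "No match: '$_'";
    }
}

print "$_\n" for @report;
```

For the first line this should report the date, "07:05 PM", "Fresno Grizzlies", and "W 4-8"; the second line falls through to the "No match" branch.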


When you have a match that you don't bind explicitly to a variable, use of "$_" is assumed. In general I think use of that reduces readability, but for short scripts (especially ones that run with -n or -p) it's okay.
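
As a small illustration of the difference (the $line value is made up, and the for loop is only there to put something in $_, much as -n does with each input line):

```perl
use strict;
use warnings;

# Hypothetical sample value, for illustration only.
my $line = "04/03/2008";

# Explicit binding: =~ says which variable to match against.
my $explicit = ($line =~ m#^\d+/\d+/\d+$#) ? 1 : 0;

# Implicit binding: with no =~, the match is applied to $_.
my $implicit;
for ($line) {    # $_ is $line inside this block
    $implicit = m#^\d+/\d+/\d+$# ? 1 : 0;
}

print "explicit=$explicit implicit=$implicit\n";   # explicit=1 implicit=1
```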



Edited at 2009-02-12 03:29 am (UTC)
[info]teh_kei wrote:
Feb. 12th, 2009 03:42 am (UTC)
OK, when you break it down like that, it's a lot easier to understand. Thanks! I feel slightly smarter now :D
[info]driftpeasant wrote:
Feb. 12th, 2009 02:38 am (UTC)
Beavers? Nah dude. You need the glory that is the Portland Timbers.
[info]black_chair wrote:
Feb. 12th, 2009 02:46 am (UTC)
They both play at PGE Park, just in different seasons. If I'm going to sit in the cold and watch a sport, it's going to be hockey.
[info]rlusardi.myopenid.com wrote:
Apr. 22nd, 2009 06:59 pm (UTC)
Got it to work on the Timbers schedule without any changes. Great job, and thanks.
[info]black_chair wrote:
Apr. 22nd, 2009 07:37 pm (UTC)
I am happy to be of service.
(Anonymous) wrote:
May. 17th, 2009 04:40 pm (UTC)
Timbers
Little slow at this stuff. How do you get it to work for the Timbers schedule?
[info]black_chair wrote:
May. 17th, 2009 05:39 pm (UTC)
Re: Timbers
That wasn't me... you'll have to ask the guy who actually did it...