Programming in Perl

Homework 3

Frequently Asked Questions

  1. My program is only opening the first file, and fails for every file after that. What's wrong???
    Almost certainly, it's because you're not using glob() correctly. Recall that in scalar context glob() returns undef once it has cycled through all possible expansions of its argument. If there's only one possible expansion, it returns undef on the second call. The best idea for this homework is to call glob() no more than once in your entire program, in list context:

    print "Enter a user name:\n";
    chomp (my $user = <STDIN>);
    my ($dir) = glob("~$user/");
    # Now use $dir throughout your program. Never call glob() again.
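
    To see the difference between the two calling styles, here is a minimal sketch. The scratch directory and the file name index.html are assumptions made only for illustration; they are not part of the homework. The pattern has exactly one expansion, so the second scalar-context call comes back undef, while the list-context call returns the match in one step.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    # Hypothetical setup: a scratch directory holding exactly one file.
    my $tmp = tempdir(CLEANUP => 1);
    open(my $fh, '>', "$tmp/index.html") or die "Cannot create file: $!";
    close($fh);

    # Scalar context: the same glob() call site behaves like an iterator.
    my @scalar_results;
    for my $attempt (1 .. 2) {
        my $file = glob("$tmp/*.html");
        push @scalar_results, defined $file ? 'found' : 'undef';
    }

    # List context: every expansion is returned at once.
    my ($first) = glob("$tmp/*.html");
    my $list_result = defined $first ? 'found' : 'undef';

    print "scalar context: @scalar_results\n";
    print "list context:   $list_result\n";
    ```

    The first line printed is "scalar context: found undef", which is the same symptom described in the question: the first attempt works and the next one fails.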
    
  2. What exactly does the user enter when prompted for an RCS Id?
    The user will enter only the RCS Id, with no tilde or directory path. For example, if your program were to search my public_html directory, lallip would be read from STDIN.
  3. What do we do if the "clickable text" is actually other HTML tags such as images or formatting?
    It is acceptable to simply print the actual HTML text that comes between <a href="..."> and </a>.
  4. Do we need to search for HTML files in subdirectories of the public_html directory?
    No, recursive searching is not part of the assignment.
  5. If quotes surround the link address, should they be printed out?
    No. Under no circumstances should quotation marks surround the link address in your output.
  6. May we assume that if a link address has an opening quote it will have a closing quote, and that if it doesn't it won't?
    Yes, you may make this assumption. I'm not even sure a browser would recognize a link with only one quotation mark.
  7. Do we have to worry about any HTML <base href="..."> tags?
    No, but that sounds like a heck of an idea for a couple above and beyond points...
  8. What should happen to the formatting of the local copy if the phrase "Rensselaer University" is broken up onto two lines?
    Programmer's prerogative. My program deletes the newline entirely. Yours can delete the newline, put it before RPI, or put it after RPI.
  9. Should 'rensselaer' be replaced if it is within a hyperlink?
    Yes. ALL instances of Rensselaer and Rensselaer University must be replaced. This includes 'normal' text, links, clickable text, titles, and comments.
  10. Should the list of links sent to standard output come from the original file, or from our local copy with the modifications?
    The original, unmodified file.
  11. Should we print out <a > tags that do not have an href (e.g., name, id, etc.)?
    No. Only hyperlinks should be printed. Note that this includes all protocols, including http, ftp, telnet, and mailto, as well as local links that do not have a protocol specified.
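
    As a rough illustration of questions 3, 5, 6, and 11 taken together, here is one possible sketch of the link extraction. The helper name extract_links, the sample HTML, and the tab-separated output format are all assumptions for this example, not requirements of the assignment. The pattern is deliberately simple: it relies on the assumption from question 6 that quotes are balanced, and it would be confused by a > character inside an attribute value.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: returns [address, clickable text] pairs for every
    # <a> tag in $html that carries an href attribute.
    sub extract_links {
        my ($html) = @_;
        my @links;
        while ($html =~ m{<a\b([^>]*)>(.*?)</a>}gis) {
            my ($attrs, $text) = ($1, $2);
            # Skip anchors that have no href (name, id, and so on).
            next unless $attrs =~ m{\bhref\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))}i;
            # Whichever alternative matched, the captured address has no quotes.
            my $address = defined $1 ? $1 : defined $2 ? $2 : $3;
            push @links, [$address, $text];
        }
        return @links;
    }

    # Made-up sample input covering the cases from the questions above.
    my $sample = join "\n",
        '<a name="top">Top</a>',
        '<a href="http://www.rpi.edu/">Rensselaer <b>home</b></a>',
        '<a href=notes.html>Notes</a>',
        '<A HREF="mailto:someone@example.com">Mail</A>';

    for my $link (extract_links($sample)) {
        my ($address, $text) = @$link;
        print "$address\t$text\n";
    }
    ```

    With this sample, the named anchor is skipped, the quoted and unquoted addresses are both printed without quotation marks, and the clickable text is printed as raw HTML, including the <b> tags.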
  12. May we assume that the word "Institute" will always follow "Rensselaer Polytechnic"?
    No! This is not a valid assumption. Under the rules of the homework, if you encounter "Rensselaer Polytechnic" followed by anything other than "Institute", you must replace it with "RPI Polytechnic".
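
    One way to express the replacement rule from questions 8, 9, and 12 is a single substitution, sketched below. It assumes that the full phrases "Rensselaer Polytechnic Institute" and "Rensselaer University" both become "RPI", and that any other "Rensselaer" becomes "RPI" by itself; check those assumptions against the homework statement. The sub name replace_rensselaer is hypothetical, and the match is case-sensitive as written.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: replaces Rensselaer, optionally followed by
    # "Polytechnic Institute" or "University", with RPI.
    sub replace_rensselaer {
        my ($text) = @_;
        # The optional group only matches a complete longer phrase. Otherwise
        # just the bare word is replaced and the following text is kept.
        # \s+ also matches a newline, so a phrase split across two lines is
        # joined (one of the choices allowed in question 8).
        $text =~ s/\bRensselaer(?:\s+Polytechnic\s+Institute|\s+University)?\b/RPI/g;
        return $text;
    }

    for my $phrase ('Rensselaer Polytechnic Institute',
                    'Rensselaer Polytechnic High School',
                    "Rensselaer\nUniversity",
                    'the city of Rensselaer') {
        print replace_rensselaer($phrase), "\n";
    }
    ```

    The second phrase comes out as "RPI Polytechnic High School", which is the case question 12 asks about; the other three come out as "RPI", "RPI", and "the city of RPI".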