I just completed level 2 of The Python Challenge on pythonchallenge.com. I am still learning Python, so please bear with me and any silly mistakes I may have made.
I am looking for some feedback about what I could have done better in my code. Two areas specifically:
- How could I have identified the comment section of the HTML file more easily? I used a roundabout method that scans backward from the end of the page until it finds the start of the last comment ("<!--"). The slice it produces includes a few extra characters that I recognized and anticipated (the trailing "-->" and a leading "-"). What condition would have found this comment more cleanly, so that I could put just its contents in a new string to be counted?
This is what I wrote:

    from collections import Counter
    import requests

    page = requests.get('http://www.pythonchallenge.com/pc/def/ocr.html')
    pagetext = ""
    pagetext = (page.text)

    #find out what number we are going back to
    i = 1
    x = 4
    testchar = ""
    testcharstring = ""
    while x == 4:
        testcharstring = pagetext[-i:]
        testchar = testcharstring[0]
        if testchar == "-":
            testcharstring = pagetext[-(i+1)]
            testchar = testcharstring[0]
            if testchar == "-":
                testcharstring = pagetext[-(i+2)]
                testchar = testcharstring[0]
                if testchar == "!":
                    testcharstring = pagetext[-(i+3)]
                    testchar = testcharstring[0]
                    if testchar == "<":
                        x = 3
                    else:
                        i += 1
                        x = 4
                else:
                    i += 1
                    x = 4
            else:
                i += 1
                x = 4
        else:
            i += 1

    print(i)
    newstring = pagetext[-i:]
    charcount = Counter(newstring)
    print(charcount)

And this is the source HTML:

    <html>
    <head>
      <title>ocr</title>
      <link rel="stylesheet" type="text/css" href="../style.css">
    </head>
    <body>
    <center><img src="ocr.jpg">
    <br><font color="#c03000">
    recognize the characters. maybe they are in the book, <br>but MAYBE they are in the page source.</center>
    <br>
    <br>
    <br>
    <font size="-1" color="gold">
    General tips:
    <li>Use the hints. They are helpful, most of the times.</li>
    <li>Investigate the data given to you.</li>
    <li>Avoid looking for spoilers.</li>
    <br>
    Forums: <a href="http://www.pythonchallenge.com/forums"/>Python Challenge Forums</a>, read before you post.
    <br>
    IRC: irc.freenode.net #pythonchallenge
    <br><br>
    To see the solutions to the previous level, replace pc with pcc, i.e. go to: http://www.pythonchallenge.com/pcc/def/ocr.html
    </body>
    </html>

    <!--
    find rare characters in the mess below:
    -->

    <!--

This is followed by thousands of characters, and the comment concludes with '-->'.
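
For reference, the backward scan described above amounts to finding the last "<!--" in the page and then the first "-->" after it. The following is only a minimal sketch of that idea, not a definitive solution: the helper name `last_comment`, the short `sample` string standing in for `page.text`, and the choice to treat "rare" as "appears exactly once" are all assumptions made for illustration.

```python
from collections import Counter


def last_comment(html: str) -> str:
    """Return the text between the last '<!--' and the '-->' that follows it."""
    start = html.rfind("<!--")          # opener of the last comment in the page
    if start == -1:
        raise ValueError("no comment opener found")
    body_start = start + len("<!--")    # skip past the opener itself
    end = html.find("-->", body_start)  # first closer after that opener
    if end == -1:
        raise ValueError("comment is not closed")
    return html[body_start:end]


# Hypothetical stand-in for page.text; the real page is far longer.
sample = (
    "<html><body>text</body></html>\n"
    "<!--\nfind rare characters in the mess below:\n-->\n"
    "<!--\n%%$@%$a@@$%\n$$@b%@$%@\n-->\n"
)

mess = last_comment(sample)
counts = Counter(mess)

# Assumption: "rare" means a non-whitespace character that occurs exactly once.
rare = [ch for ch, n in counts.items() if n == 1 and not ch.isspace()]
print("".join(rare))
```

Because the slice starts after "<!--" and stops before "-->", the extra "-" and "-->" characters never reach the `Counter`. Using `rfind` rather than `find` matters here because the page contains two comments and only the last one holds the characters to count.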