4
\$\begingroup\$

I wrote some code to extract the information from a table, but it takes an extremely long time.

The table is in the format of a calendar. I need the information on an Excel sheet with column 1 as the day number and column 2 as the accommodation type. On the accommodation type, I only want the “Camp” types and the number of available units to show. I’m sure there will be a better way to loop through the table. Any help will be appreciated.

Here is my code:

Dim driver As New WebDriver, i As Integer, mysheet As Worksheet Dim ele As WebElement Set driver = New EdgeDriver Application.ScreenUpdating = False driver.Start "edge" driver.Get "http://www.sanparks.org/reservations/accommodation/calendar-month/park/26/camp/41/date/2022-07-01" Application.Wait Now + TimeValue("00:00:05") Set mysheet = Sheets("Sheet1") i = 7 eRow = Sheet1.Cells(Rows.Count, 3).End(xlUp).Offset(7, 0).Row For Each ele In driver.FindElementsByClass("day") If ele.FindElementByCss("span").Text <> "" Then mysheet.Cells(i, 2).Value = ele.FindElementByCss("span").Text On Error Resume Next If Mid(ele.FindElementByClass("bold-name").Text, 1, 4) = "Camp" Then 'AccommodationType mysheet.Cells(i, 3).Value = ele.FindElementByClass("bold-name").Text & " " & ele.FindElementByClass("unit-numbers").Text End If On Error GoTo 0 i = i + 1 End If Next ele 
\$\endgroup\$
1

2 Answers 2

4
\$\begingroup\$

If you aren't locked into using Selenium, you can accomplish something similar to this using a web request. Web Requests should be faster as they don't need to worry about rendering any items on screen, they just return data.

I looked at the web traffic, and noticed this site uses an API to provide the data to the front end. I'm using this API, to return the data as JSON, parsing that, then returning that to an Excel range. This takes about 2 seconds on my machine.

In order to get this to work, you'll need a copy of this --> https://github.com/VBA-tools/VBA-JSON in your project. Follow the directions on the project page to get that setup.

Option Explicit Public Sub ScrapeCampsites() Dim URL As String URL = "https://www.sanparks.org/includes/SANParksApp/API/v1/bookings/accommodation/getAvailabilityAccommodationMonthList.php?resort=10&month=7&year=2022" Dim response As String With CreateObject("MSXML2.XMLHTTP.6.0") .Open "GET", URL, False .send response = .responseText End With 'Include https://github.com/VBA-tools/VBA-JSON into your project Dim Json As Object: Set Json = JsonConverter.ParseJson(response) Dim Accomodations As Object: Set Accomodations = Json("DATA") Dim Accomodation As Variant Dim Availabilities As Variant Dim Availability As Variant Dim Results As Variant ReDim Results(1 To 2, 1 To 50000) Dim i As Long For Each Accomodation In Accomodations If Mid$(Accomodation("accommodationtypedesc"), 1, 4) = "Camp" Then Availabilities = Accomodation("availabilities").Items For Each Availability In Availabilities i = i + 1 Results(1, i) = Availability("availableDate") Results(2, i) = Availability("available") Next End If Next ReDim Preserve Results(1 To 2, i) Dim mySheet As Worksheet Set mySheet = Sheets("Sheet1") mySheet.Range("A1:B1").Value = Array("Date", "Available") mySheet.Range("A2:B" & UBound(Results, 2) + 1) = Application.WorksheetFunction.Transpose(Results) End Sub 
\$\endgroup\$
7
  • \$\begingroup\$Out of interest, how did you inspect the traffic, what was the process?\$\endgroup\$
    – Greedo
    CommentedMar 7, 2022 at 22:04
  • \$\begingroup\$I used the network tab in Chrome to via the web requests back and forth on the page. I noticed one of the endpoints had API in the URI, so I took a peek at what was included in the response. It was the JSON that feeds into the calendar on the page.\$\endgroup\$CommentedMar 8, 2022 at 13:32
  • \$\begingroup\$Greedo - I had my vba code checking the availability on IE and not through a webdriver. This took about 2 seconds to complete.\$\endgroup\$
    – Nelj
    CommentedMar 11, 2022 at 4:50
  • \$\begingroup\$Ryan Wildry - Thank you Ryan for having a look. I have followed the instructions and got the following error: KeyNotFoundError - Dictionary key not found: STATUS The error is on the following line: "json_ParseObject.Item(json_Key) = json_ParseValue(json_String, json_Index)"\$\endgroup\$
    – Nelj
    CommentedMar 11, 2022 at 4:54
  • \$\begingroup\$@Nelj This is a weird error, I don't see STATUS listed in the keys for code above. It might be related to requiring a reference to Microsoft Scripting Runtime the json parser uses this data structure to deserialize the data.\$\endgroup\$CommentedMar 11, 2022 at 13:57
2
\$\begingroup\$

I notice a couple of things:

  1. I found reference that says "Webdriver Get will wait until the page has fully loaded before returning the control" so we don't need to wait 5 seconds.

  2. FindElementsByClass is an expensive operation. We should never call it multiple times if we can avoid it. I've implemented storing the results in a variable to cut the calls in half per loop.

Try this out and let me know if it's any faster:

Application.ScreenUpdating = False Dim driver As WebDriver Set driver = New EdgeDriver driver.Start "edge" driver.Get "http://www.sanparks.org/reservations/accommodation/calendar-month/park/26/camp/41/date/2022-07-01" Dim mysheet As Worksheet Set mysheet = Sheets("Sheet1") Dim i As Integer i = 7 eRow = mysheet.Cells(Rows.Count, 3).End(xlUp).Offset(7, 0).Row Dim dayClass As Variant dayClass = driver.FindElementsByClass("day") Dim ele As WebElement For Each ele In dayClass Dim spanText As String spanText = ele.FindElementByCss("span").Text If spanText <> "" Then mysheet.Cells(i, 2).Value = spanText Dim boldText As String boldText = ele.FindElementByClass("bold-name").Text On Error Resume Next If Mid(boldText, 1, 4) = "Camp" Then 'AccommodationType mysheet.Cells(i, 3).Value = boldText & " " & ele.FindElementByClass("unit-numbers").Text End If On Error GoTo 0 i = i + 1 End If Next ele ```
\$\endgroup\$
2
  • \$\begingroup\$Thank you for having a look at my problem and the advice. I have tried your solution. It is working, but still takes a long time to complete. I had the following code when the website used to be available on IE and I didn't have to work through a web driver (this is only the web scraping part):\$\endgroup\$
    – Nelj
    CommentedMar 12, 2022 at 5:26
  • \$\begingroup\$eRow = Sheet1.Cells(Rows.Count, 3).End(xlUp).Offset(7, 0).Row For Each ele In .document.all Select Case ele.className Case "day" RowCount = RowCount + 1 sht.Range("B" & RowCount) = ele.innerText Case "units" If Mid(ele.NextSibling.innerText, 1, 4) = StandType Then sht.Range("C" & RowCount) = ele.NextSibling.innerText & " - " & ele.innerText End If End Select Next ele\$\endgroup\$
    – Nelj
    CommentedMar 13, 2022 at 7:06

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.