I am creating a web scraper that needs to open multiple tabs, one for each item whose pin icon is filled. For example, each page I need to open has div class="class-selector-item-pinned" in its source code.

<dropdown-content max-width="800" min-width="450" no-padding="" vertical-offset="0" dir="ltr" dropdown-content="" style="--dropdown-verticaloffset:0px;" opened=""><div class="classselector-wrapper" aria-live="assertive"> <div id="classSelectorId" class="placeholder placeholder-live" aria-live="assertive"> <div class="2_7_615 2_8_459 body-compact"> <ul class="datalist vui-list"> <li class="datalist-item datalist-item-actionable datalist-simpleitem vui-selected" id="2_9_421" data-actionid="2_11_656"> <div class="datalist-item-content" title="Class 1"> <div class="class-selector-item class-selector-item-pinned" data-org-unit-id="12345"> <div class="2_160_610 class-selector-item-name"> <a class="link datalist-item-actioncontrol" id="2_11_656" href="/abc/home/12345">Class1</a> </div> <span id="2_10_630" data-active-id="2_161_292" data-inactive-id="2_162_883"><button-icon icon="tier1:pin-filled" id="2_161_292" onclick="O(&quot;__g2&quot;,3)();" text="Un-pin &quot;Class 1&quot;" dir="ltr" type="button"></button-icon> <button-icon icon="tier1:pin-hollow" class="hidden" id="2_162_883" onclick="O(&quot;__g2&quot;,4)();" text="Pin &quot;Class 1&quot;" dir="ltr" type="button"></button-icon> </span></div> </div> <div class="clear"></div> </li> <li class="datalist-item datalist-item-actionable datalist-simpleitem vui-selected" id="2_12_929" data-actionid="2_14_114"> <div class="datalist-item-content" title="Class 2"> <div class="class-selector-item class-selector-item-pinned" data-org-unit-id="23456"> <div class="2_160_610 class-selector-item-name"> <a class="link datalist-item-actioncontrol" id="2_14_114" href="/abc/home/23456">Class 2</a> </div> <span id="2_13_229" data-active-id="2_163_477" data-inactive-id="2_164_80"><button-icon icon="tier1:pin-filled" id="2_163_477" onclick="O(&quot;__g2&quot;,5)();" text="Un-pin &quot;Class 2&quot;" dir="ltr" type="button"></button-icon> <button-icon icon="tier1:pin-hollow" class="hidden" id="2_164_80" onclick="O(&quot;__g2&quot;,6)();" 
text="Pin &quot;Class 2&quot;" dir="ltr" type="button"></button-icon> </span></div> </div> <div class="clear"></div> </li> <li class="datalist-item datalist-item-actionable datalist-simpleitem vui-selected" id="2_15_372" data-actionid="2_17_26"> <div class="datalist-item-content" title="Class 3"> <div class="class-selector-item class-selector-item-pinned" data-org-unit-id="34567"> <div class="2_160_610 class-selector-item-name"> <a class="link datalist-item-actioncontrol" id="2_17_26" href="/abc/home/34567">Class 3</a> </div> <span id="2_16_595" data-active-id="2_165_349" data-inactive-id="2_166_873"><button-icon icon="tier1:pin-filled" id="2_165_349" onclick="O(&quot;__g2&quot;,7)();" text="Un-pin &quot;Class 3&quot;" dir="ltr" type="button"></button-icon> <button-icon icon="tier1:pin-hollow" class="hidden" id="2_166_873" onclick="O(&quot;__g2&quot;,8)();" text="Pin &quot;Class 3&quot;" dir="ltr" type="button"></button-icon> </span></div> </div> <div class="clear"></div> </li> 

I need the web scraper to find every div whose class includes "class-selector-item-pinned" and then take the value of its data-org-unit-id attribute. For example, the list would be [12345, 23456, 34567] in this case.

The line of source code I am referring to is:

<div class="class-selector-item-pinned" data-org-unit-id="12345">
<div class="class-selector-item-pinned" data-org-unit-id="23456">
<div class="class-selector-item-pinned" data-org-unit-id="34567">

This is what I have done so far, but the list comes back empty.

Get List of Unit IDs

courseString = 'https://example.com/abc/p/home'
listofUnitID = []
links = [elem.get_attribute("data-org-unit-id") for elem in driver.find_elements_by_class_name("class-selector-item-pinned")]

Filter out none type from list

res = []
for val in links:
    if val is not None:
        res.append(val)
print(res)

List to Keep Only Classes

for i in res:
    if courseString in i:
        listofUnitID.append(i)
print(listofUnitID)
  • What does the links array contain? Does it show any elements? Commented Nov 24, 2020 at 3:12

1 Answer


Looking at your HTML, the div you actually want to scrape is this one:

<div class="class-selector-item class-selector-item-pinned" data-org-unit-id="34567"> ... ... 

Not this:

<div class="class-selector-item-pinned" data-org-unit-id="12345">
<div class="class-selector-item-pinned" data-org-unit-id="23456">
<div class="class-selector-item-pinned" data-org-unit-id="34567">

That is why your code returns nothing: the div you are targeting has multiple classes.

.find_elements_by_class_name accepts only a single class name.
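For context, an HTML class attribute is a whitespace-separated list of class names, so the target div carries two classes at once. Here is a minimal offline sketch of that idea, using Python's standard-library html.parser rather than Selenium. The PinnedUnitIdParser class and the sample markup (including the unpinned 99999 entry) are made up for illustration.

```python
from html.parser import HTMLParser


class PinnedUnitIdParser(HTMLParser):
    """Collect data-org-unit-id from divs that carry a given class token."""

    def __init__(self, wanted_class="class-selector-item-pinned"):
        super().__init__()
        self.wanted_class = wanted_class
        self.unit_ids = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        attr_map = dict(attrs)
        # The class attribute is a space-separated list of tokens.
        class_tokens = (attr_map.get("class") or "").split()
        unit_id = attr_map.get("data-org-unit-id")
        if self.wanted_class in class_tokens and unit_id is not None:
            self.unit_ids.append(unit_id)


# Trimmed-down markup modelled on the question; the second div is a
# hypothetical unpinned entry added to show that it is skipped.
SAMPLE_HTML = """
<div class="class-selector-item class-selector-item-pinned" data-org-unit-id="12345"></div>
<div class="class-selector-item" data-org-unit-id="99999"></div>
<div class="class-selector-item class-selector-item-pinned" data-org-unit-id="23456"></div>
"""

parser = PinnedUnitIdParser()
parser.feed(SAMPLE_HTML)
print(parser.unit_ids)  # ['12345', '23456']
```

Note that the attribute values come back as strings, so convert them with int() if you need numbers.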

Try .find_elements_by_css_selector('css_selector') instead, like this:

links = [elem.get_attribute("data-org-unit-id") for elem in driver.find_elements_by_css_selector(".class-selector-item.class-selector-item-pinned")] 
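If you are on a newer Selenium, note that the find_elements_by_* helpers were deprecated in Selenium 4 and later removed in favour of find_elements(By..., ...). Here is a rough sketch of the same lookup in that style. The function names unit_ids_from_elements and pinned_unit_ids are my own, and I am assuming driver is already on the page with the dropdown opened.

```python
def unit_ids_from_elements(elements):
    """Read data-org-unit-id from each element, skipping elements without it."""
    ids = (elem.get_attribute("data-org-unit-id") for elem in elements)
    return [unit_id for unit_id in ids if unit_id is not None]


def pinned_unit_ids(driver):
    """Find the pinned divs with the Selenium 4 locator API."""
    # Imported here so unit_ids_from_elements can be used without Selenium.
    from selenium.webdriver.common.by import By

    elements = driver.find_elements(
        By.CSS_SELECTOR, ".class-selector-item.class-selector-item-pinned"
    )
    return unit_ids_from_elements(elements)
```

With a live driver you would call links = pinned_unit_ids(driver). The None filtering from the question is folded into the helper, so no second pass is needed.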
