PHP - DOMDocument: scrape divs, don't remove images


Tag : php , By : BinaryBoy
Date : November 29 2020, 04:01 AM

For every div, you can use $div->getElementsByTagName("img") to get its images. Then loop over the images, check whether the alt attribute of each img is "test", and read its data-src attribute:
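$dom = new DOMDocument();
// $file holds the HTML source as a string; @ suppresses warnings about malformed markup
@$dom->loadHTML($file);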
$xpath = new DOMXPath($dom);
$divs = $xpath->query('//div[@class="test"]');
foreach ($divs as $key => $div) {
    echo $div->textContent . "<br>";
    foreach ($div->getElementsByTagName("img") as $img) {
        if ($img->getAttribute('alt') === 'test') {
            echo $img->getAttribute('data-src') . "<br>";
        }
    }
}


How to properly replace inline images with styled floating divs via PHP's DOMDocument?


Tag : php , By : Vorinowsky
Date : March 29 2020, 07:55 AM
As I commented, your code had several errors that prevented you from getting started. Your concept looks sound from what I see, and the code itself had only minor issues.
You were iterating over the document root element. That is just one element, so the query picked up all images inside it at once. The second xpath must be relative to the child, so it has to start with a dot (.). If you load an HTML chunk, DOMDocument creates the missing elements, such as body, around it, so you need to account for that in your xpath queries and in the output. The way you accessed the attributes was also wrong; with error reporting on, PHP would have given you error information about that.
$html_from_editor = <<<EOD
<p>Intro Text</p>
<ul>
   <li>List point 1</li>
   <li>List point 2</li>
</ul>
<p>Some text before an image. 
   <img alt="Slide 1" src="/files/slide1.png" /> 
   Maybe some text in between, nobody knows what the scientists are up to. 
   <img alt="Slide 2" src="/files/slide2.png" /> 
   And even more text right after that.
</p>
EOD;

// create DOMDocument
$doc = new DOMDocument();
// load WYSIWYG html into DOMDocument
$doc->loadHTML($html_from_editor);
// create DOMXpath
$xpath = new DOMXpath($doc);

// create list of all first level DOMNodes (these are p's or ul's in most cases)
# NOTE: this is XHTML now
$children = $xpath->query("/html/body/p");

foreach ( $children AS $child ) {
    // now get all images
    $cpath = new DOMXpath($doc);
    $images = $cpath->query('.//img', $child); # NOTE relative to $child, mind the .

    // if no images are found, continue
    if (!$images->length) continue;

    // insert replacement node
    $lb_div = $doc->createElement('div');
    $lb_div->setAttribute("class", "custom");
    $lb_div = $child->parentNode->insertBefore($lb_div, $child);


    foreach ( $images AS $img ) {
        // get attributes
        $atts = $img->attributes;
        $atts = (object) iterator_to_array($atts); // make $atts more accessible    

        // create the new link with lightbox and full view
        $lb_a = $doc->createElement('a');
        $lb_a->setAttribute("href", '/files/fullview'.$atts->src->value);
        $lb_a->setAttribute("rel", "lightbox[slide][".$atts->alt->value."]");

        // create the new image tag for thumbnail
        $lb_img = $img->cloneNode(); # NOTE clone instead of creating new
        $lb_img->setAttribute("src", '/files/thumbs'.$atts->src->value);

        // bring the new nodes together and insert them
        $lb_a->appendChild($lb_img);
        $lb_div->appendChild($lb_a);

        // remove the original image (via its own parent, in case it is nested deeper than $child)
        $img->parentNode->removeChild($img);
    }
}

// get body content (original content)
$result = '';
foreach ($xpath->query("/html/body/*") as $child) {
    $result .= $doc->saveXML($child); # NOTE or saveHtml 
}

echo $result;

How to scale divs when I minimize the browser so that the divs don't collapse over each other


Tag : html , By : bdurbin
Date : March 29 2020, 07:55 AM
I have a small question: how can I set the heights of two divs so that they don't collapse but instead scale dynamically when I minimize the window? I tried this in JS.
I think you need to bind to the resize event:
$(document).ready(function(){
  var sidebar = document.getElementById('sidebar').offsetHeight;
  $(window).resize(function(){
    var footer = document.getElementById('footer').offsetHeight;
    document.getElementById('sidebar').style.height = sidebar - footer + 'px';
  });
});

Black space between 2 of my divs. I don't know how to remove it. Have tried a couple of things


Tag : html , By : Steve
Date : March 29 2020, 07:55 AM
Adding margin-top: -4px; to .playerName will pull it up, and you won't have the space between the two divs.

Onclick of images add and remove class to multiple Divs


Tag : javascript , By : Nosayaba
Date : March 29 2020, 07:55 AM
You can reset all of these classes by setting the class attribute of the divs with jQuery's attr():
    $(".full.swatch-img").on('click',function(){

        let colorCode = $(this).attr('id');

        $(".adult-section-box .pc-row, .youth-section-box .pc-row")
        .attr('class',colorCode+" pc-row");

    })

How to iterate through hidden divs and scrape text?


Tag : python , By : Eric
Date : March 29 2020, 07:55 AM
Speech details are loaded using an AJAX request. This means you don't even need Selenium for this; requests alone is enough, which speeds things up quite a bit:
import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent':  'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0'
}


def make_soup(url: str) -> BeautifulSoup:
    res = requests.get(url, headers=headers)
    res.raise_for_status()
    return BeautifulSoup(res.text, 'html.parser')


def fetch_speech_details(speech_id: str) -> str:
    url = f'https://pm.gc.ca/eng/views/ajax?view_name=news_article&view_display_id=block&view_args={speech_id}'
    res = requests.get(url, headers=headers)
    res.raise_for_status()
    data = res.json()
    html = data[1]['data']
    soup = BeautifulSoup(html, 'html.parser')
    body = soup.select_one('.views-field-body')
    return str(body)


def scrape_speeches(soup: BeautifulSoup) -> list:
    speeches = []
    for teaser in soup.select('.teaser'):
        title = teaser.select_one('.title').text.strip()
        speech_id = teaser['data-nid']
        speech_html = fetch_speech_details(speech_id)
        s = {
            'title': title,
            'details': speech_html
        }
        speeches.append(s)
    # return the collected list so the caller can print it
    return speeches


if __name__ == "__main__":
    url = 'https://pm.gc.ca/eng/news/speeches'
    soup = make_soup(url)
    speeches = scrape_speeches(soup)
    from pprint import pprint
    pprint(speeches)

[
    {'title': 'PM remarks for Lunar Gateway', 'details': '<div class="views-field views-field-body"> <p>CHECK AGAINST DELIVERY</p><p>Hello everyone!</p><p>I’m delighted to be here at the Canadian Space Agency to share some great news with Canadians.</p><p>I’d like to start by thanking the President of the Agency, Sylvain Laporte ... },
    {...},
    ....
]
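
The fetch_speech_details function above reads the markup from data[1]['data'], which depends on the position of that entry in the JSON response. Below is a minimal sketch of a lookup that does not depend on position. It assumes the response is a list of command objects, and that the one carrying the markup has "command" set to "insert" and a string under "data". That shape is an assumption about the endpoint, not something the answer confirms. The helper name find_insert_markup and the sample payload are hypothetical and use only the standard library:

```python
import json
from typing import Any, List, Optional


def find_insert_markup(commands: List[Any]) -> Optional[str]:
    """Return the HTML of the first 'insert' command, or None if absent.

    Assumption: each entry is a dict, and the entry that carries markup
    has command == 'insert' and a string under 'data'.
    """
    for entry in commands:
        if not isinstance(entry, dict):
            continue  # skip anything that is not a command object
        if entry.get('command') == 'insert' and isinstance(entry.get('data'), str):
            return entry['data']
    return None


# Hypothetical payload, made up for illustration only.
sample = json.loads(
    '[{"command": "settings", "settings": {}},'
    ' {"command": "insert", "data": "<div class=\\"views-field-body\\"><p>Hello</p></div>"}]'
)
markup = find_insert_markup(sample)
print(markup)
```

If the response does have this shape, find_insert_markup(res.json()) could stand in for data[1]['data'], with a check for None before handing the markup to BeautifulSoup.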