Releases: aborruso/scrape-cli

Release v1.2.0

07 Sep 11:41

Bug Fix

  • Fixed XPath detection for expressions wrapped in parentheses
  • XPath expressions such as (//div[@class='coordinate lat'])[1] are now recognized as XPath rather than being treated as CSS selectors
  • Enhanced the is_xpath function with additional pattern recognition for XPath-specific syntax including attribute predicates, position predicates, and XPath functions
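
The kind of check described above can be sketched as follows. This is a minimal illustration, not the project's actual is_xpath code: the function name looks_like_xpath and the three regular expressions are assumptions chosen to mirror the bullet points (attribute predicates, position predicates, XPath functions), and leading parentheses are dropped before testing for a path prefix.

```python
import re

# Hypothetical patterns; scrape-cli's real is_xpath may use different ones.
_XPATH_HINTS = [
    re.compile(r"\[@[\w:-]+"),                    # attribute predicate, e.g. [@class='x']
    re.compile(r"\[\d+\]"),                       # position predicate, e.g. [1]
    re.compile(r"\b(?:text|last|position)\(\)"),  # XPath functions called without arguments
]


def looks_like_xpath(expr: str) -> bool:
    """Guess whether expr is XPath (True) or a CSS selector (False)."""
    s = expr.strip()
    # Drop leading parentheses so "(//div)[1]" is judged by the path it wraps.
    unwrapped = s.lstrip("(").lstrip()
    if unwrapped.startswith(("/", "./")):
        return True
    # Otherwise look for syntax that CSS selectors do not use.
    return any(pattern.search(s) for pattern in _XPATH_HINTS)
```

Under these assumptions, the parenthesized expression from the release note is classified as XPath, while a CSS attribute selector such as a[href*="/talk/"] is not, because it has no leading slash, no @ inside the brackets, and no numeric predicate.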

Installation

pip install scrape_cli==1.2.0

v1.1.9: CSS Selector Fix

14 Aug 07:29

What Changed

🐛 Bug Fix: Fixed CSS selectors being incorrectly identified as XPath

Details

  • Fixed is_xpath() function to properly distinguish CSS selectors from XPath expressions
  • CSS selectors like a[href*="/talk/"] now work correctly
  • Improved selector recognition logic to be more restrictive for XPath detection

Technical Changes

  • Updated XPath detection to only recognize expressions starting with / or // or containing ::
  • This prevents CSS attribute selectors with square brackets from being misidentified as XPath
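
The rule stated above is small enough to write out directly. The sketch below is an assumption about its shape, not a copy of the released code; the function name xpath_by_prefix_or_axis is made up for illustration.

```python
def xpath_by_prefix_or_axis(expr: str) -> bool:
    """Treat expr as XPath only if it starts with "/" or contains "::".

    A leading "//" is covered by the "/" prefix test.
    """
    s = expr.strip()
    return s.startswith("/") or "::" in s


# Square brackets alone no longer trigger XPath detection.
print(xpath_by_prefix_or_axis('a[href*="/talk/"]'))  # False
print(xpath_by_prefix_or_axis("//a[@href]"))         # True
print(xpath_by_prefix_or_axis("(//div)[1]"))         # False
```

The last call shows the limitation of so strict a rule: an XPath expression wrapped in parentheses starts with "(" and is not recognized, which is the case addressed in v1.2.0.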

Installation

pip install --upgrade scrape-cli

v1.1.8

02 Jun 13:11

What's Changed

Features

  • Added text extraction functionality with -t option
    • Extract only text content without HTML tags
    • Automatically excludes text from script and style tags
    • Cleans up whitespace for better readability
    • Particularly useful for LLMs and text processing workflows
    • Can be combined with CSS selectors or XPath expressions for targeted text extraction
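
To make the behavior listed above concrete, here is a rough sketch of text-only extraction using Python's standard html.parser module. It is not how scrape-cli implements -t (the tool may rely on a different parser); the class TextOnly and the function extract_text are hypothetical names used only for this illustration.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def extract_text(html: str) -> str:
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Join fragments with a space, then collapse all runs of whitespace.
    # This keeps block elements apart but can split a word that spans inline tags.
    return " ".join(" ".join(parser.parts).split())


sample = (
    "<html><head><style>p{color:red}</style><script>var x = 1;</script></head>"
    "<body><p>Hello   <b>world</b></p>\n<p>Bye</p></body></html>"
)
print(extract_text(sample))  # Hello world Bye
```

Collapsing whitespace at the end, rather than per fragment, keeps the output on one line regardless of how the source HTML was indented.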

v1.1.7

04 May 13:39

What's Changed

Features

  • Improved XPath detection with support for complex expressions:
    • Added support for predicates and square brackets
    • Added support for XPath functions (last(), position(), contains(), text())
    • Added support for XPath axes and attributes
    • Better handling of complex XPath expressions

v1.1.6

02 May 17:53

  • Added charset detection from HTML meta tags
  • Added support for ISO-8859-1 encoding fallback
  • Improved HTML parsing with better encoding handling
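
One way such encoding handling could work is sketched below. The order of attempts (charset declared in a meta tag, then UTF-8, then ISO-8859-1) is an assumption, as are the function name decode_html, the regular expression, and the 4096-byte search window; the release note only states that meta-tag detection and an ISO-8859-1 fallback were added.

```python
import re

# Matches both <meta charset="..."> and
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
_META_CHARSET = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?\s*([\w.:-]+)", re.IGNORECASE
)


def decode_html(raw: bytes) -> str:
    """Decode raw HTML bytes, trying the declared charset, UTF-8, then ISO-8859-1."""
    candidates = []
    # Look only at the start of the document, where meta tags normally sit.
    match = _META_CHARSET.search(raw[:4096])
    if match:
        candidates.append(match.group(1).decode("ascii"))
    candidates.append("utf-8")
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (LookupError, UnicodeDecodeError):
            # Unknown encoding name or bytes that do not fit: try the next one.
            continue
    # ISO-8859-1 maps every byte to a character, so this last step cannot fail.
    return raw.decode("iso-8859-1")
```

Because ISO-8859-1 never raises a decoding error, it works as a last resort, though it may produce wrong characters when the page actually uses some other single-byte encoding.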

v1.1.1

02 Nov 14:50

update

1.1

02 Nov 14:06

Bump version to 0.2

v1.0

29 Nov 16:44
ca91998

Update README.md