Parser fails for lecker.de recipes: JSON-LD block not found (multi-block issue)

**Description**
The Nextcloud Cookbook app fails to import recipes from `lecker.de` using the URL import feature. 

The application returns the error: *"Für den angegeben Import konnte kein Parser gefunden werden"* (No parser found for the specified import).

Investigation shows that `lecker.de` provides multiple JSON-LD blocks on their recipe pages (e.g., `Organization`, `WebSite`, `BreadcrumbList`, etc.) before the actual `Recipe` block. The current implementation of the Cookbook parser appears to look only at the first JSON-LD block (which is an `Organization` type) or fails to identify the correct source block, leading to the error. 

Since the `Recipe` schema exists further down on the page within the JSON-LD data (at block index 7), the parser should be updated to iterate through all JSON-LD blocks to locate the one with `@type: Recipe`.

**Reproduction**
Steps to reproduce the behavior:
1. Copy a URL from lecker.de (e.g., `https://www.lecker.de/haehnchen-doener-46850.html`).
2. Go to the Cookbook app in Nextcloud.
3. Click on **'Import'**.
4. Paste the URL and submit.
5. The import fails with the error: *"Für den angegeben Import konnte kein Parser gefunden werden."*

**Expected behavior**
The parser should iterate through all available `application/ld+json` blocks on the page and identify the one where `@type` is `Recipe`, allowing a successful automatic import of the recipe data.
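The expected behavior could be sketched roughly as follows. This is a minimal, standalone illustration and not the Cookbook app's actual parser code: the function name `findRecipeJsonLd` is hypothetical, and handling of `@graph` wrappers and top-level arrays is an assumption about how sites may structure their JSON-LD.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: return the first JSON-LD object whose @type is
 * (or includes) "Recipe", or null if no block on the page contains one.
 */
function findRecipeJsonLd(string $html): ?array
{
    if ($html === '') {
        return null; // DOMDocument::loadHTML() rejects an empty string
    }

    // Collect libxml warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $document = new DOMDocument();
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($document);
    foreach ($xpath->query('//script[@type="application/ld+json"]') as $script) {
        $data = json_decode($script->textContent, true);
        if (!is_array($data)) {
            continue; // skip blocks that are empty or not valid JSON
        }

        // A block may hold an @graph wrapper, a list of objects, or one object.
        if (isset($data['@graph']) && is_array($data['@graph'])) {
            $candidates = $data['@graph'];
        } elseif (isset($data[0])) {
            $candidates = $data;
        } else {
            $candidates = [$data];
        }

        foreach ($candidates as $candidate) {
            if (!is_array($candidate)) {
                continue;
            }
            // @type may be a single string or a list of strings.
            $types = (array) ($candidate['@type'] ?? []);
            if (in_array('Recipe', $types, true)) {
                return $candidate;
            }
        }
    }

    return null;
}
```

With this approach the position of the `Recipe` block on the page (index 7 on lecker.de) no longer matters, because every block is inspected before the import gives up.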

**Actual behavior**
The import fails with the error message: *"Für den angegeben Import konnte kein Parser gefunden werden."* 
Analyzing the page source reveals 8 JSON-LD blocks, and the actual recipe data is located in the 8th block (index 7). A manual file-based JSON import of this specific block works flawlessly.

**Browser**
*All: Firefox, Chrome, and Edge*

**Versions**
Nextcloud server version: Nextcloud Hub 26 Spring (34.0.1 RC2)
Cookbook version: 0.11.7
Database system: MySQL/MariaDB (Standard installation)








## Issue Description
When attempting to import recipes from certain websites (e.g., those using Cloudflare or specific compression configurations), the import fails with an internal server error. 

The underlying cause is a `Warning: DOMDocument::loadHTML(): Bytes: 0x1F 0x8B in Entity` raised by `libxml` inside the `HttpJsonLdParser`. The remote server sends the response body compressed with gzip or deflate, but the Cookbook app's internal cURL request neither declares support for compression nor decompresses the response. As a result, `DOMDocument` tries to parse raw binary gzip data (which starts with the magic bytes `0x1F 0x8B`), and parsing fails.
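To illustrate the magic-byte observation, here is a small sketch of how a still-compressed body could be detected and decoded before it reaches `DOMDocument`. The helper name `decodeIfGzipped` is hypothetical and not part of the Cookbook code base, and the sketch assumes PHP's zlib extension is available.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: if $body starts with the gzip magic bytes
 * (0x1F 0x8B), try to decompress it; otherwise return it unchanged.
 */
function decodeIfGzipped(string $body): string
{
    if (strncmp($body, "\x1F\x8B", 2) !== 0) {
        return $body; // not gzip data, pass through as-is
    }

    // gzdecode() returns false (and emits a warning) on corrupt input,
    // so suppress the warning and fall back to the original body.
    $decoded = @gzdecode($body);

    return $decoded === false ? $body : $decoded;
}
```

A check like this only covers gzip, not deflate or Brotli, so it is at most a safety net around the downloader rather than a replacement for letting cURL handle decompression.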

---




## Workaround / Fix

We resolved the issue locally by applying two changes. The first one fixes the root cause, while the second acts as a defensive buffer for large/complex HTML documents.

### 1. Fix the Root Cause (Enable cURL Decompression)
We added `CURLOPT_ENCODING => ''` to the cURL options in `HtmlDownloadService.php`. An empty string tells cURL to send an `Accept-Encoding` header listing every encoding it supports (for example `Accept-Encoding: gzip, deflate, br`, depending on how libcurl was built) and to **automatically decompress** the response in memory before handing the clean HTML string to PHP.

**File modified:** `lib/Service/HtmlDownloadService.php`
```php
// Inside fetchHtmlPage(string $url) method (around line 114)
$opt = [
    CURLOPT_USERAGENT => 'Mozilla/5.0 (X11; Linux x86_64; rv:129.0) Gecko/20100101 Firefox/129.0',
    CURLOPT_ENCODING  => '', // <-- ADDED THIS LINE to automatically handle GZIP/Deflate
];
```

### 2. Defensive Buffer for Large HTML Documents (Add `LIBXML_PARSEHUGE`)

**File modified:** `lib/Helper/HTMLParser/HttpJsonLdParser.php`
```php
// Inside parse(\DOMDocument $document, ?string $url) method
// Change the loading configuration to include LIBXML_PARSEHUGE
@$document->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD | LIBXML_PARSEHUGE);
```

Parser fails for lecker.de recipes: JSON-LD block not found (multi-block issue) #3224
