Description
The Nextcloud Cookbook app fails to import recipes from lecker.de using the URL import feature.
The application returns the error: "Für den angegeben Import konnte kein Parser gefunden werden" (No parser found for the specified import).
Investigation shows that lecker.de provides multiple JSON-LD blocks on its recipe pages (Organization, WebSite, BreadcrumbList, and others) before the actual Recipe block. The current implementation of the Cookbook parser appears either to look only at the first JSON-LD block (which is of type Organization) or to fail to identify the correct block, which leads to the error.
Since the Recipe schema exists further down on the page within the JSON-LD data (at block index 7), the parser should be updated to iterate through all JSON-LD blocks to locate the one with @type: Recipe.
Reproduction
Steps to reproduce the behavior:
- Copy a URL from lecker.de (e.g., https://www.lecker.de/haehnchen-doener-46850.html).
- Go to the Cookbook app in Nextcloud.
- Click on 'Import'.
- Paste the URL and submit.
- See the error: "Für den angegeben Import konnte kein Parser gefunden werden" (No parser found for the specified import).
Expected behavior
The parser should iterate through all available application/ld+json blocks on the page and identify the one where @type is Recipe, allowing a successful automatic import of the recipe data.
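The expected behavior could be sketched roughly as follows. This is a minimal illustration, not the Cookbook app's actual parser code: the function names findRecipeJsonLd and isRecipeNode are hypothetical, and the handling of "@graph" wrappers and top-level lists is an assumption about how sites commonly structure JSON-LD, not something observed on lecker.de.

```php
<?php
declare(strict_types=1);

/**
 * Returns true when a decoded JSON-LD node declares the type "Recipe".
 * "@type" may be a single string or a list of strings.
 * (Hypothetical helper, not part of the Cookbook code base.)
 */
function isRecipeNode(array $node): bool
{
    $type = $node['@type'] ?? null;
    return $type === 'Recipe' || (is_array($type) && in_array('Recipe', $type, true));
}

/**
 * Scans every <script type="application/ld+json"> block in the HTML and
 * returns the first node typed as Recipe, or null when none is found.
 * (Hypothetical helper, not part of the Cookbook code base.)
 */
function findRecipeJsonLd(string $html): ?array
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $document = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep markup warnings out of the output
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($document);
    foreach ($xpath->query('//script[@type="application/ld+json"]') as $script) {
        $data = json_decode($script->textContent, true);
        if (!is_array($data)) {
            continue; // invalid JSON or a scalar value: skip this block
        }

        // A block may hold a single node, a list of nodes, or an "@graph" wrapper.
        if (isset($data['@graph']) && is_array($data['@graph'])) {
            $candidates = $data['@graph'];
        } elseif ($data === array_values($data)) {
            $candidates = $data;   // top-level list of nodes
        } else {
            $candidates = [$data]; // single node
        }

        foreach ($candidates as $node) {
            if (is_array($node) && isRecipeNode($node)) {
                return $node;
            }
        }
    }

    return null;
}
```

With a page that lists an Organization block before the Recipe block, findRecipeJsonLd($html) would skip the first block and return the decoded Recipe node instead of giving up.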
Actual behavior
The import fails with the error message: "Für den angegeben Import konnte kein Parser gefunden werden."
Analyzing the page source reveals 8 JSON-LD blocks, and the actual recipe data is located in the 8th block (index 7). A manual file-based JSON import of this specific block works flawlessly.
Browser
All (Firefox, Chrome, and Edge)
Versions
Nextcloud server version: Nextcloud Hub 26 Spring (34.0.1 RC2)
Cookbook version: 0.11.7
Database system: MySQL/MariaDB (Standard installation)
Issue Description
When attempting to import recipes from certain websites (e.g., those using Cloudflare or specific compression configurations), the import fails with an internal server error.
The underlying cause is the warning "Warning: DOMDocument::loadHTML(): Bytes: 0x1F 0x8B in Entity", raised by libxml inside the HttpJsonLdParser. This happens because the remote server sends the response body compressed with GZIP/Deflate, but the Cookbook app's internal cURL request neither declares support for compressed responses nor decompresses them. As a result, DOMDocument attempts to parse raw, binary GZIP data (which starts with the magic bytes 0x1F 0x8B), and the parser fails.
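To illustrate the failure mode, the magic-byte check described above can be sketched as a small guard. This is not the fix we applied and not code from the Cookbook app: decodeIfGzipped is a hypothetical helper, it assumes PHP's zlib extension is available, and it only recognises GZIP, not raw Deflate or Brotli.

```php
<?php
declare(strict_types=1);

/**
 * Returns the body as plain text, decompressing it first when it starts
 * with the GZIP magic bytes 0x1F 0x8B.
 * (Hypothetical helper; requires the zlib extension for gzdecode().)
 */
function decodeIfGzipped(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) !== 0) {
        return $body; // no GZIP signature: pass the body through unchanged
    }

    // gzdecode() returns false and emits a warning on corrupt input,
    // so the warning is suppressed and the failure is reported explicitly.
    $decoded = @gzdecode($body);
    if ($decoded === false) {
        throw new RuntimeException('Body starts with GZIP magic bytes but could not be decompressed.');
    }

    return $decoded;
}
```

A guard like this would only mask the symptom for GZIP responses; letting cURL negotiate and decode the encoding itself covers more cases.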
Workaround / Fix
We resolved the issue locally by applying two changes. The first one fixes the root cause, while the second is a defensive measure for large or complex HTML documents.
1. Fix the Root Cause (Enable cURL Decompression)
We added CURLOPT_ENCODING => '' to the cURL options in HtmlDownloadService.php. An empty string tells cURL to send an Accept-Encoding header listing every encoding the installed libcurl build supports (for example, Accept-Encoding: gzip, deflate, br) and to decompress the response automatically, so that PHP receives the clean HTML string.
File modified: lib/Service/HtmlDownloadService.php
    // Inside the fetchHtmlPage(string $url) method (around line 114)
    $opt = [
        CURLOPT_USERAGENT => 'Mozilla/5.0 (X11; Linux x86_64; rv:129.0) Gecko/20100101 Firefox/129.0',
        CURLOPT_ENCODING => '', // <-- ADDED THIS LINE to automatically handle GZIP/Deflate
    ];
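For context, here is a self-contained sketch of a download that uses this option. It is not the code from HtmlDownloadService.php: buildDownloadOptions and downloadHtml are hypothetical names, and the options other than CURLOPT_ENCODING are assumptions chosen only to make the example complete. It requires PHP's curl extension.

```php
<?php
declare(strict_types=1);

/**
 * Builds the cURL options for fetching a page.
 * (Hypothetical helper; only CURLOPT_ENCODING reflects the change described above.)
 */
function buildDownloadOptions(string $url): array
{
    return [
        CURLOPT_URL => $url,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // assumption: follow redirects
        CURLOPT_ENCODING => '',         // advertise all supported encodings and decode the response
    ];
}

/**
 * Downloads a page and returns its decompressed body.
 * (Hypothetical helper, not part of the Cookbook code base.)
 */
function downloadHtml(string $url): string
{
    $handle = curl_init();
    curl_setopt_array($handle, buildDownloadOptions($url));

    $body = curl_exec($handle);
    if ($body === false) {
        throw new RuntimeException('Download failed: ' . curl_error($handle));
    }

    return $body;
}
```

Keeping the options in a separate function makes it possible to check the configuration without performing a network request.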
2. Defensive Measure (Add LIBXML_PARSEHUGE)
File modified: lib/Helper/HTMLParser/HttpJsonLdParser.php
    // Inside the parse(\DOMDocument $document, ?string $url) method
    // Change the loading configuration to include LIBXML_PARSEHUGE
    @$document->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD | LIBXML_PARSEHUGE);