
Conversation


@richet richet commented Aug 29, 2025

What this does

  • Removes the use of the Invoke endpoint in favor of the Converse endpoint
  • Opens up the use of most Bedrock models instead of just the Anthropic ones
  • Allows document uploads with models that don't support them via the Invoke API, e.g. claude-3-haiku (see the sketch below for the API difference)
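
As a rough illustration of the difference, here is a minimal sketch using the aws-sdk-bedrockruntime gem. RubyLLM itself signs raw HTTP requests rather than using the SDK, so the client calls below are for comparison only, not this PR's implementation:

require 'aws-sdk-bedrockruntime'
require 'json'

client = Aws::BedrockRuntime::Client.new(region: 'us-east-1')

# Invoke: the request body is model-specific (Anthropic's schema here),
# which is why the old implementation was limited to Anthropic models.
resp = client.invoke_model(
  model_id: 'anthropic.claude-3-haiku-20240307-v1:0',
  content_type: 'application/json',
  body: {
    anthropic_version: 'bedrock-2023-05-31',
    max_tokens: 256,
    messages: [{ role: 'user', content: 'Hello' }]
  }.to_json
)
puts JSON.parse(resp.body.read).dig('content', 0, 'text')

# Converse: one uniform request/response shape across Bedrock models,
# including document attachments for models whose Invoke schema lacks them.
resp = client.converse(
  model_id: 'anthropic.claude-3-haiku-20240307-v1:0',
  messages: [{ role: 'user', content: [{ text: 'Hello' }] }]
)
puts resp.output.message.content.first.text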

Type of change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation
  • Performance improvement

Scope check

  • I read the Contributing Guide
  • This aligns with RubyLLM's focus on LLM communication
  • This isn't application-specific logic that belongs in user code
  • This benefits most users, not just my specific use case

Quality check

  • I ran overcommit --install and all hooks pass
  • I tested my changes thoroughly
  • I updated documentation if needed
  • I didn't modify auto-generated files manually (models.json, aliases.json)

API changes

  • Breaking change
  • New public methods/classes
  • Changed method signatures
  • No API changes

Related issues

@crmne crmne (Owner) left a comment


Adding the Converse API is a great idea, but this PR has significant issues:

  1. No tests!
  2. Code complexity
  3. It doesn't follow the lead of the other providers in terms of method names, modules, etc.
  4. Overcommit was not installed

@richet richet marked this pull request as draft September 3, 2025 18:57
@richet richet marked this pull request as ready for review September 12, 2025 06:35
@richet richet changed the title WIP - Use Converse API for Bedrock provider Use Converse API for Bedrock provider Sep 12, 2025
@richet richet (Author) commented Sep 12, 2025

@crmne Several changes made per your comments. Well tested and ready for review.

@codecov codecov bot commented Sep 14, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.57%. Comparing base (4ff2231) to head (2d460e2).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #377   +/-   ##
=======================================
  Coverage   84.57%   84.57%           
=======================================
  Files          37       37           
  Lines        1932     1932           
  Branches      499      499           
=======================================
  Hits         1634     1634           
  Misses        298      298           

☔ View full report in Codecov by Sentry.

@crmne crmne (Owner) left a comment


This is a big patch @richet so thank you for the effort but there are still significant changes needed:

I still see that the organization of the code, especially in chat.rb, doesn't match that of the other providers. There are a ton of methods there that belong in other modules of a provider implementation. Check the OpenAI provider for an example of what belongs where. I'll resume reviewing once that's done.

Thank you again for the monumental effort.

@richet richet (Author) commented Sep 16, 2025

@crmne Thanks for the review and for pushing me to clean this up a bit more. I renamed and cleaned up a lot of the methods, to the point where I think it's close to what you have in OpenAI.

@richet richet (Author) commented Oct 8, 2025

@crmne The VCR cassettes conflict whenever your main branch is updated, so I have fixed them again. Hopefully this one can be reviewed again soon 🤞

@michaeldiscala

Hi @richet and @crmne 👋 We're very interested in using Bedrock with RubyLLM and would love to help move this forward if we can. Are there any ways we could pitch in? Thank you!

RSpec.describe RubyLLM::Chat do
  include_context 'with configured RubyLLM'

  # Helper to mitigate Bedrock rate limits in CI by retrying with backoff
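  # (Editor's sketch: the helper body is truncated in this excerpt; the method
  # name and the rescued error class are assumptions, not this PR's actual code.)
  def with_bedrock_backoff(max_attempts: 5)
    attempts = 0
    begin
      yield
    rescue RubyLLM::RateLimitError
      attempts += 1
      raise if attempts >= max_attempts
      sleep(2**attempts) # exponential backoff: 2s, 4s, 8s, ...
      retry
    end
  end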
Contributor


We only need this and the test delays when recording VCRs, right? Would be nice to isolate it somehow. Bedrock is a bit of a pain, isn't it?

Author


Yes, that's correct. This PR also keeps getting merge conflicts because of the cassettes.

@tpaulshippy (Contributor)

Hi @richet and @crmne 👋 We're very interested in using Bedrock with RubyLLM and would love to help move this forward if we can. Are there any ways we could pitch in? Thank you!

Which models in Bedrock are you looking to use? Just curious.

@richet richet (Author) commented Nov 18, 2025

Which models in Bedrock are you looking to use? Just curious.

I've been running this fork in prod for about a month now using Haiku 3.5 and Sonnet 4 in the AWS APAC region.

@tpaulshippy (Contributor)

Which models in Bedrock are you looking to use? Just curious.

I've been running this fork in prod for about a month now using Haiku 3.5 and Sonnet 4 in the AWS APAC region.

I use those models on Bedrock with RubyLLM without these changes. Isn't this for other non-Anthropic models?

@richet richet (Author) commented Nov 18, 2025

I haven't looked at the source for the last month, but unless something has changed it is currently pinned to the US region models. In my case we have to use APAC. We also found that our rate limits on the Converse endpoint are higher than on Invoke, which is the main change this PR makes. The ability to use all of the other models via the Converse endpoint is a bonus.
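
For context on the region pinning: Bedrock's cross-region inference profiles prefix the base model ID with a geography, so region support largely comes down to which ID prefix the provider builds. The IDs below are examples only, not a guaranteed or exhaustive list:

# Illustrative only: availability varies by account and region.
# The same base model is addressed through geography-prefixed profile IDs:
US_PROFILE   = 'us.anthropic.claude-3-5-haiku-20241022-v1:0'
APAC_PROFILE = 'apac.anthropic.claude-3-5-haiku-20241022-v1:0'
# A provider hardcoded to the 'us.' prefix cannot reach the APAC profiles,
# which is the limitation described above.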

@tpaulshippy (Contributor)

FYI and for others, #338 did get merged.

@bensheldon

I'm a longtime lurker in this thread 😁

Which models in Bedrock are you looking to use? Just curious.

I'd like to use the Amazon Nova models. I'm already using them directly via the AWS SDK / Converse, and they're good enough for my uses (summarization, translation, vision) given how inexpensive they are.

@tpaulshippy (Contributor)

Awesome. I'm considering bringing this into my fork soon and just trying to gauge interest.

@michaeldiscala

Which models in Bedrock are you looking to use? Just curious.

We are also primarily looking to use the Anthropic models, so we can already access them via the current invoke implementation.

As context for my original question: we'd like to build on top of RubyLLM, and there are a couple of features we were hoping to contribute back. Before we start that work, we want to make sure we're targeting the right baseline. Since this PR is a significant change to the Bedrock provider, it'd be ideal if it were merged first.

After researching a bit more, though, I'm realizing the features most pressing for us may actually be accessible via the current invoke_model implementation (guardrails and cross-region inference profiles), so our needs may not actually push toward the Converse API.

We would still love to help move the bedrock provider forward though -- regardless of which API endpoint makes the most sense for RubyLLM's broader goals -- and will keep following along. Thanks!

