Last active: August 26, 2025 00:40
OpenAI.Response 0.7.0 improvement ideas using various CLI tools (Claude Code, Codex, Gemini)
Based on my analysis of your library, here are comprehensive improvement suggestions organized by priority and impact:

🎯 High Priority Improvements

1. Enhanced Error Handling & Resilience
- Add automatic retry with exponential backoff for transient failures (429, 500, and 503 errors)
- Implement a circuit breaker pattern to prevent cascading failures
- Add configurable timeout strategies per operation type
- Provide better error context by including request details in errors

2. Type Safety & Specifications
- Add comprehensive typespecs to all public functions
- Define custom types for commonly used structures (e.g., @type model :: String.t())
- Add Dialyzer checks to the CI pipeline
- Create behaviour modules for extensibility points

3. Observability & Monitoring
- Integrate :telemetry for metrics and instrumentation
- Add request/response logging with configurable levels
- Track API usage metrics (tokens, costs, latencies)
- Provide hooks for custom monitoring solutions

🚀 Performance Optimizations

4. Connection Management
- Implement connection pooling using Finch instead of bare Req
- Add HTTP/2 support for multiplexing
- Enable keep-alive connections for better throughput
- Add request deduplication for identical concurrent requests

5. Rate Limiting & Throttling
- Built-in rate limiter respecting OpenAI's limits
- Request queuing with priority support
- Adaptive throttling based on response headers
- Token budget management per time window
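A built-in limiter could start as a pure sliding-window check. The sketch below (module name and API are invented for illustration, not part of the library) tracks recent request timestamps and rejects a call once the window is full:

```elixir
defmodule RateLimiter do
  @moduledoc "Sliding-window rate limiter sketch (illustrative, not the library's API)."

  # Returns {:ok, updated_timestamps} when the call is allowed, or
  # {:error, :rate_limited} once `limit` calls fall within the window.
  def check(timestamps, limit, window_ms, now_ms) do
    # Drop timestamps that have aged out of the window.
    recent = Enum.filter(timestamps, fn t -> now_ms - t < window_ms end)

    if length(recent) < limit do
      {:ok, [now_ms | recent]}
    else
      {:error, :rate_limited}
    end
  end
end
```

Adaptive throttling would then adjust `limit` at runtime from OpenAI's rate-limit response headers, and a GenServer would own the timestamp state.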
6. Caching Strategy
- Response caching with configurable TTL
- ETS-based cache for in-memory storage
- Cache key normalization for better hit rates
- Conditional requests using ETags
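An ETS-backed cache with TTL and normalized keys might look like the following sketch (all names are illustrative; a production version would also need periodic expiry sweeps):

```elixir
defmodule ResponseCache do
  @moduledoc "ETS cache sketch with TTL and normalized keys (illustrative names)."

  @table :openai_response_cache

  def init do
    :ets.new(@table, [:named_table, :public, read_concurrency: true])
  end

  # Sort the request params before hashing so semantically identical
  # requests (regardless of key order) map to the same cache key.
  def cache_key(params) when is_map(params) do
    bin = params |> Enum.sort() |> :erlang.term_to_binary()
    Base.encode16(:crypto.hash(:sha256, bin))
  end

  def put(key, value, ttl_ms) do
    expires_at = System.monotonic_time(:millisecond) + ttl_ms
    :ets.insert(@table, {key, value, expires_at})
  end

  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, value, expires_at}] ->
        if System.monotonic_time(:millisecond) < expires_at, do: {:ok, value}, else: :miss

      [] ->
        :miss
    end
  end
end
```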
📚 Developer Experience

7. Enhanced Streaming API
- Add GenStage/Flow support for backpressure handling
- Provide a Broadway producer for stream processing
- Add stream transformation helpers (buffer, batch, throttle)
- Implement stream resumption on connection failures

8. Testing Improvements
- Add property-based tests using StreamData
- Create a test helpers module for mocking responses
- Implement contract testing against OpenAI API specs
- Add performance benchmarks using Benchee

9. Configuration Flexibility
- Support multiple API keys for different contexts
- Environment-specific configurations
- Runtime configuration updates without restart
- Configuration validation on startup
🔧 Architectural Enhancements

10. Middleware/Plugin System

defmodule OpenAI.Responses.Middleware do
  @callback before_request(map()) :: {:ok, map()} | {:error, term()}
  @callback after_response(Response.t()) :: {:ok, Response.t()} | {:error, term()}
end
11. Batch Operations Support
- Batch request API for multiple operations
- Parallel execution with configurable concurrency
- Transaction-like semantics for related operations

12. Schema Validation
- Compile-time schema validation
- Runtime input validation before API calls
- Better error messages for schema mismatches
Example: Retry with Exponential Backoff
defmodule OpenAI.Responses.Retry do
  # Assumes the library's Error module exposes retryable?/1 for
  # classifying transient failures (429/500/503).
  alias OpenAI.Responses.Error

  def with_retry(fun, opts \\ []) do
    max_attempts = Keyword.get(opts, :max_attempts, 3)
    base_delay = Keyword.get(opts, :base_delay, 1000)
    do_retry(fun, 1, max_attempts, base_delay)
  end

  defp do_retry(fun, attempt, max_attempts, base_delay) do
    case fun.() do
      # Remote calls such as Error.retryable?/1 are not allowed in
      # guards, so the retryability check happens in the clause body.
      {:error, error} when attempt < max_attempts ->
        if Error.retryable?(error) do
          # Exponential backoff: base_delay, then 2x, 4x, ...
          delay = round(base_delay * :math.pow(2, attempt - 1))
          Process.sleep(delay)
          do_retry(fun, attempt + 1, max_attempts, base_delay)
        else
          {:error, error}
        end

      result ->
        result
    end
  end
end
Example: Telemetry Integration

defmodule OpenAI.Responses.Telemetry do
  require Logger

  def setup do
    :telemetry.attach_many(
      "openai-responses-handler",
      [
        [:openai, :request, :start],
        [:openai, :request, :stop],
        [:openai, :request, :exception]
      ],
      &handle_event/4,
      nil
    )
  end

  def emit_request_start(metadata) do
    :telemetry.execute([:openai, :request, :start], %{system_time: System.system_time()}, metadata)
  end

  # Default handler: log events; swap in real metrics reporting as needed.
  defp handle_event(event, measurements, metadata, _config) do
    Logger.debug("#{inspect(event)}: #{inspect(measurements)} #{inspect(metadata)}")
  end
end
Example: Connection Pool with Finch

defmodule OpenAI.Responses.Client do
  def child_spec(opts) do
    name = Keyword.get(opts, :name, __MODULE__)

    Finch.child_spec(
      name: name,
      pools: %{
        "https://api.openai.com" => [
          size: 10,
          count: 2,
          protocol: :http2,
          conn_opts: [transport_opts: [timeout: 30_000]]
        ]
      }
    )
  end
end
🎁 Additional Features to Consider

1. Assistant API Support - Add first-class support for OpenAI Assistants
2. File Upload Handling - Streamlined file upload for vision/document processing
3. Token Counting - Pre-flight token counting to estimate costs
4. Response Validation - Validate API responses against expected schemas
5. Async Job Queue - Background job processing for long-running operations
6. WebSocket Support - For real-time streaming (when OpenAI adds it)
7. GraphQL-like Query Language - Simplified response field selection
8. Response Transformers - Pluggable response transformation pipeline

📊 Metrics to Track

- Request latency percentiles (p50, p95, p99)
- Token usage by model and operation
- Error rates by error type
- Cache hit rates
- Retry success rates
- Cost per operation type

🏗️ Implementation Priority

1. Phase 1 (Quick wins): Type specs, basic retry logic, telemetry
2. Phase 2 (Core improvements): Connection pooling, rate limiting, enhanced error handling
3. Phase 3 (Advanced features): Caching, middleware system, batch operations
4. Phase 4 (Ecosystem): Testing helpers, monitoring dashboards, example applications

These improvements would transform your library into a production-grade, enterprise-ready OpenAI client for Elixir applications while maintaining its current simplicity and ease of use.
Here's a focused architectural review with concrete, high-impact improvements, tied to specific modules and flows.

Strengths

- Clear API surface: create/1,2 (plus create!), stream/1, run/2, call_functions/2, list_models/0,1, request/1.
- Good separation: core (lib/responses.ex) vs. helpers (Response, Stream, Schema, Pricing, Internal).
- Sensible ergonomics: map/keyword/string-or-atom keys accepted; streaming helpers (delta/1, text_deltas/1, json_events/1).
- Cost tracking via Decimal with centralized pricing.
- Function-calling loop (run/2) is minimal and easy to extend.

Gaps To Address

- SSE parsing is brittle and cannot truly stop early.
- Internal.prepare_payload/1 overwrites text settings when a schema is present.
- Option normalization is duplicated and inconsistent across modules.
- Error handling in a few public APIs can crash or be inflexible to missing config.
- Limited extensibility for HTTP and telemetry; retries aren't available even when errors are retryable.
- Types/specs are sparse; Dialyzer value could be much higher.
Core Improvements

- SSE robustness and control:
  - Implement a real SSE parser with buffering across chunks and support for multi-line data: records, comments, id:, and optional retry:; preserve partials between into callbacks in lib/responses/stream.ex.
  - Respect callback backpressure: if the callback returns {:error, reason}, stop streaming (close the connection / halt into) and still surface the captured final response if present.
  - Make @streaming_timeout configurable (e.g., a stream_timeout option, with the default via Application.get_env/3).
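A buffering parser along these lines (a standalone sketch with invented names, not the library's actual module) keeps the unconsumed tail between chunks and joins multi-line data: records with newlines, as the SSE spec requires:

```elixir
defmodule SSEParser do
  @moduledoc """
  Buffering SSE parser sketch: feed raw chunks, get back complete
  events plus the unconsumed remainder. Handles events split across
  chunk boundaries and multi-line data: records; ignores comments.
  """

  # Returns {events, rest}; `rest` is buffered for the next chunk.
  def feed(buffer, chunk) do
    parts = String.split(buffer <> chunk, "\n\n")
    {complete, [rest]} = Enum.split(parts, length(parts) - 1)
    events = complete |> Enum.map(&parse_event/1) |> Enum.reject(&is_nil/1)
    {events, rest}
  end

  defp parse_event(raw) do
    raw
    |> String.split("\n")
    |> Enum.reduce(%{}, &add_line/2)
    |> case do
      fields when map_size(fields) == 0 -> nil
      fields -> fields
    end
  end

  # Lines starting with ":" are comments (e.g. keep-alives).
  defp add_line(":" <> _comment, acc), do: acc

  defp add_line(line, acc) do
    case String.split(line, ":", parts: 2) do
      [field, value] ->
        value = String.trim_leading(value, " ")
        # Multiple data: lines in one event are joined with newlines.
        Map.update(acc, field, value, &(&1 <> "\n" <> value))

      _ ->
        acc
    end
  end
end
```

Early stop then becomes a matter of halting the loop that calls `feed/2` when the callback returns `{:error, reason}`.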
- Preserve text settings alongside schema:
  - In OpenAI.Responses.Internal.prepare_payload/1, merge text: %{format: ...} into the existing text instead of replacing it. Today, a schema clobbers user text options (e.g., verbosity).
  - Same for create/2 preserved options: ensure downstream schema usage doesn't silently discard preserved text fields.
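The merge itself is small. This sketch (string keys and module name chosen for illustration) shows the non-clobbering behavior: existing text options such as verbosity survive, and only the format entry is set:

```elixir
defmodule TextMerge do
  @moduledoc "Sketch: merge a schema-derived format into user text options."

  # Map.update/4 keeps any existing keys (e.g. "verbosity") and only
  # sets or overwrites the "format" entry, instead of replacing the map.
  def put_format(payload, format) do
    Map.update(payload, "text", %{"format" => format}, fn text ->
      Map.put(text, "format", format)
    end)
  end
end
```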
- Unify option normalization:
  - Introduce OpenAI.Responses.Options to normalize and access options once (stringify keys, convert keyword lists to maps, deep-normalize). Then replace get_option/2, has_option?/2, and the ad-hoc conversions in Responses and Internal.
  - Provide helpers for "preserved LLM options" merging to avoid scattering key-handling logic.
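Such an Options module could be little more than a single normalization pass. This simplified sketch (invented names; lists of maps and structs are deliberately out of scope) stringifies keys, converts keyword lists to maps, and recurses into nested maps:

```elixir
defmodule Options do
  @moduledoc "Sketch of one-pass option normalization (illustrative names)."

  # Keyword lists become maps, atom keys become strings,
  # and nested maps are normalized recursively.
  def normalize(opts) when is_list(opts), do: opts |> Map.new() |> normalize()

  def normalize(opts) when is_map(opts) do
    Map.new(opts, fn {k, v} -> {to_string(k), normalize_value(v)} end)
  end

  defp normalize_value(v) when is_map(v), do: normalize(v)
  defp normalize_value(v), do: v
end
```

With this in place, get_option/2 reduces to a Map.get/2 on the normalized map.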
- Safer configuration and error surfaces:
  - Replace System.fetch_env! in Internal.get_api_key/0 with a graceful error: if the key is missing, return {:error, %Error{message: "Missing OPENAI_API_KEY", status: 401}} from request/1 instead of raising.
  - In list_models/0,1, avoid pattern matching {:ok, response} = request(...); return {:error, err} on failure instead (non-breaking: add list_models!/1 if you want a bang that raises).
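A graceful key lookup could look like the sketch below. It returns a plain error map here to stay self-contained; the real code would build the library's %Error{} struct instead:

```elixir
defmodule ApiKey do
  @moduledoc "Sketch of a non-raising API key lookup."

  # Returns {:error, map} instead of raising like System.fetch_env!/1.
  def fetch do
    case System.fetch_env("OPENAI_API_KEY") do
      {:ok, key} when key != "" -> {:ok, key}
      _ -> {:error, %{message: "Missing OPENAI_API_KEY", status: 401}}
    end
  end
end
```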
- HTTP client adaptability and retries:
  - Extract HTTP to OpenAI.Responses.Client with a behaviour and a default Req adapter. Allow dependency injection (a configurable adapter: ...) to mock or swap clients.
  - Add simple retry with backoff for Error.retryable?/1 in request/1 (configurable max retries/jitter).
  - Make the base URL and timeouts configurable (env/config).
- Streaming API ergonomics:
  - Add stream!/1 (raises), and allow Responses.stream(options, on_chunk: ..., on_error: ...) in addition to returning an Enumerable.
  - Document that Responses.Stream.text_deltas/1 and json_events/1 compose with Responses.stream/1.
Response/Schema Enhancements

- Response:
  - Add @type t and @specs across Response and the public API (helps mix dialyzer).
  - Consider convenience fields (model, usage) mirroring the body for quick access.
  - Optionally support atom-key parsed output (opt-in) from extract_json/1 to integrate more easily with Elixir code, while defaulting to strings.
- Schema:
  - Validate option keys (e.g., detect typos like max_items vs. maxItems) and provide clear errors.
  - Factor the large normalization pipeline into smaller internal modules to reduce cognitive load; keep a single entry point (build_output/1, build_function/3).
  - Provide a tiny "schema DSL" helper (optional) for readability in user code.
Function Calling

- run/2:
  - Allow a user-provided "policy" for how to merge function outputs (e.g., append vs. replace), and a max-iterations guard.
  - Support async/multiple tool calls in parallel if the API returns several (batch function execution with Task.async_stream/3).
- call_functions/2:
  - Add structured error outputs with machine-readable codes (not just strings), to let callers decide retry vs. abort per tool.
  - Keep the current atom safety for keyword functions; optionally accept a translations map to map external names to internal atoms.
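Parallel tool execution with Task.async_stream/3 could be sketched as follows (the shapes of the call maps and the functions map are invented for illustration):

```elixir
defmodule ToolRunner do
  @moduledoc "Sketch: run independent tool calls concurrently (illustrative shapes)."

  # Each result is tagged with the call id so outputs can be matched
  # back to the originating tool call. async_stream preserves order.
  def run_all(calls, functions, opts \\ []) do
    calls
    |> Task.async_stream(
      fn %{id: id, name: name, arguments: args} ->
        case Map.fetch(functions, name) do
          {:ok, fun} -> {id, {:ok, fun.(args)}}
          :error -> {id, {:error, {:unknown_function, name}}}
        end
      end,
      max_concurrency: Keyword.get(opts, :max_concurrency, 4),
      timeout: Keyword.get(opts, :timeout, 30_000)
    )
    |> Enum.map(fn {:ok, tagged} -> tagged end)
  end
end
```

The {:unknown_function, name} tuple is an example of the machine-readable error codes suggested above.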
Pricing

- Keep static pricing, but:
  - Expose the price version date via Pricing.version/0.
  - Add fallback pricing or a warning when the model is unknown; optionally let users inject pricing for private models.
  - Consider simple memoization for get_pricing/1 (though the current lookup is constant-time).
Telemetry and Observability

- Emit Telemetry events:
  - [:openai_responses, :request, :start | :stop | :exception]
  - [:openai_responses, :stream, :chunk], plus :complete and :error.
- Include fields: model, endpoint, request_id (if available), byte sizes, timings.
- Optional log metadata for correlation IDs from OpenAI responses.
DX, Testing, and Stability

- Types and specs:
  - Add @spec and @type throughout (Responses, Stream, Response, Schema, Pricing, Error).
- Tests:
  - Unit-test the new SSE parser, especially multi-line data: and partial chunk boundaries.
  - Add tests ensuring the schema merges into existing text fields.
  - Add request retry tests using a mock adapter.
  - Add tests for missing-API-key handling that return {:error, %Error{}} (no raise).
- Lint/format/dialyzer:
  - Run mix credo --strict and add missing specs; keep functions narrow and pattern-match where needed, as you already do.
Smaller Code-Level Fixes

- Responses.list_models/0,1: return {:error, error} on request failure; add list_models!/1 for raising.
- Stream.start_streaming/1: consider linking the Task to the stream owner and gracefully cancelling on cleanup_stream/1 rather than using :kill, so the into cleanup runs.
- Stream.parse_stream_chunk/1: tolerate events with only data: (the SSE spec allows a missing event: field, with a default) and accumulate multiple data: lines per event.
- Responses.create!/1,2: ensure raise always raises an exception struct (wrap non-exception terms in Error).
If you'd like, I can:

- Implement the safe text-merge with schema (low-risk, high-value).
- Add an SSE buffer-based parser and early-stop handling.
- Introduce a small Options module and refactor call sites incrementally.
- Add retry/backoff to request/1 behind config flags.
- Wire basic Telemetry around the request/stream lifecycle.

Tell me which subset you want first, and I'll patch the code and add focused tests.
I've thoroughly reviewed the openai_responses library and have a clear plan for improvement. The current modular architecture is a good starting point, but I've identified several areas to enhance its robustness, extensibility, and maintainability.

My plan is as follows:

1. Core API Refactoring (`OpenAI.Responses`): I'll simplify the create functions by introducing a private do_create/1 helper to handle normalized options, reducing code duplication. The run/2 function will be streamlined to accept only maps, and I'll improve error messages in run!/2. A request!/1 function will be added for consistency.
2. Configuration & Extensibility: I'll make the API base URL configurable and modularize the request logic to pave the way for supporting other OpenAI APIs like Embeddings and Fine-tuning, likely by introducing a shared OpenAI.Client module.
3. Response Handling (`Response` Module): To improve efficiency, I'll implement memoization for the extract_* functions and consolidate parsing logic into a single parse_response/1 function.
4. Schema Generation (`Schema` Module): I'll refactor the complex build_property and normalize_spec functions for better readability and add more robust validation for schema inputs.
5. Pricing Data (`Pricing` Module): I'll create a script to automate updates to the pricing data, ensuring it stays current with OpenAI's changes.
6. Streaming (`Stream` Module): I'll simplify the stream_with_callback/2 function by exploring alternatives to the Agent-based approach and improve error handling in json_events/1.
7. Code Quality: I'll reduce boilerplate by replacing custom helpers with more generic solutions and expand the test suite to cover new features and refactoring, focusing on error handling and edge cases.

I will begin with a small, safe refactoring: modifying the run function in lib/responses.ex to exclusively accept a map of options, which will improve API consistency.
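Memoizing the extract_* parsing mentioned in item 3 can be as simple as caching the parsed value on the struct itself, computing once on first access. A minimal sketch, with invented field names and a stand-in for the real parsing work:

```elixir
defmodule Memo do
  @moduledoc "Sketch: memoize a parsed value on the struct (illustrative fields)."

  defstruct [:body, :parsed]

  # First access computes and stores; later accesses reuse the value.
  # The caller keeps the updated struct to benefit from the cache.
  def parsed(%__MODULE__{parsed: nil, body: body} = r) do
    value = expensive_parse(body)
    {%{r | parsed: value}, value}
  end

  def parsed(%__MODULE__{parsed: value} = r), do: {r, value}

  # Stand-in for real parsing (e.g. JSON decoding of the body).
  defp expensive_parse(body), do: String.upcase(body)
end
```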
Think deeply about the current package implementation and architecture, and suggest ways to improve it.