Feature/knowledge agent#18

Merged
amrit110 merged 23 commits into main from feature/knowledge-agent
Jan 30, 2026

Member

amrit110 commented Jan 27, 2026

This pull request introduces a significant refactor and expansion of the aieng.agent_evals package, focusing on establishing a robust, modular, and well-documented foundation for agent evaluation and knowledge-grounded QA workflows. It adds new configuration, display, and grounding utilities, provides a comprehensive API for the knowledge agent, and removes outdated example implementations. It also updates environment variable management and pre-commit hooks for improved developer experience.

Key changes:

Agent Evaluation and Knowledge Agent Core

  • Introduced a new aieng.agent_evals.display module providing rich, reusable display utilities for evaluation outputs, including functions for displaying responses, comparisons, metrics, and messages in Jupyter notebooks using the rich library.
  • Added a new aieng.agent_evals.knowledge_agent package, exposing the KnowledgeGroundedAgent, configuration, evaluation classes, session management, and tracing utilities via a clear API in __init__.py.
  • Implemented a centralized configuration system for the knowledge agent using a Pydantic KnowledgeAgentConfig class, supporting environment variables and .env file loading.
  • Added a grounding_tool.py module defining the GroundedResponse and GroundingChunk models, a factory for a Google Search tool for agent grounding, and a utility to format responses with inline citations.
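
As a rough illustration of the inline-citation utility mentioned above, here is a minimal sketch. It is not the PR's code: the PR describes Pydantic models, while this sketch uses standard-library dataclasses to stay dependency-free, and the field names (`chunks`, `supports`) and the function name `format_with_citations` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GroundingChunk:
    """One cited source (field names are assumptions for this sketch)."""

    title: str
    uri: str


@dataclass
class GroundedResponse:
    """Answer text plus the sources that support it (illustrative shape)."""

    text: str
    chunks: List[GroundingChunk] = field(default_factory=list)
    # Maps a character offset in `text` to indices into `chunks`.
    supports: Dict[int, List[int]] = field(default_factory=dict)


def format_with_citations(response: GroundedResponse) -> str:
    """Return a new string with [n] markers inserted and a source list appended.

    The input object is not modified; a new string is built and returned.
    """
    text = response.text
    # Insert from the highest offset down so earlier offsets stay valid.
    for offset in sorted(response.supports, reverse=True):
        markers = "".join(f"[{i + 1}]" for i in response.supports[offset])
        text = text[:offset] + markers + text[offset:]
    sources = [
        f"[{n}] {chunk.title} ({chunk.uri})"
        for n, chunk in enumerate(response.chunks, start=1)
    ]
    if not sources:
        return text
    return "\n".join([text, "", "Sources:", *sources])
```

Returning a new string rather than editing the response in place matches the commit note "Return results instead of modifying input params".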

Package Structure and Cleanup

  • Removed outdated example implementation modules from aieng.agent_evals.impl, including example_impl.py and its __init__.py.
  • Created a new aieng.agent_evals.__init__.py that exposes display utilities and provides a clear package-level docstring.

Developer Experience

  • Updated .env.example to provide clear documentation and defaults for environment variables required for Gemini/OpenAI-compatible LLMs and LangFuse tracing.
  • Refined .pre-commit-config.yaml to use the correct ruff-check hook and expanded ignored error codes for nbqa-ruff to reduce unnecessary linting noise.
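
The environment-driven configuration described in this PR could look roughly like the sketch below. It is a stand-in, not the PR's KnowledgeAgentConfig: it uses only the standard library instead of Pydantic, and the variable names (`LLM_API_KEY`, `LLM_BASE_URL`, `LLM_MODEL`, `LANGFUSE_HOST`) and the default model name are assumptions chosen for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AgentSettings:
    """Hypothetical settings object; the PR's real config class is Pydantic-based."""

    api_key: str
    base_url: str
    model: str = "example-model"  # placeholder default, not a real model name
    langfuse_host: Optional[str] = None


def load_settings(env: Optional[Mapping[str, str]] = None) -> AgentSettings:
    """Build settings from environment variables, failing early on missing ones."""
    env = os.environ if env is None else env
    required = ("LLM_API_KEY", "LLM_BASE_URL")  # assumed variable names
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return AgentSettings(
        api_key=env["LLM_API_KEY"],
        base_url=env["LLM_BASE_URL"],
        model=env.get("LLM_MODEL", "example-model"),
        langfuse_host=env.get("LANGFUSE_HOST"),
    )
```

Passing a mapping instead of reading `os.environ` directly keeps the loader easy to test without touching the real environment.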

amrit110 self-assigned this Jan 27, 2026
amrit110 added the enhancement label (New feature or request) Jan 27, 2026
amrit110 requested review from fcogidi and lotif January 29, 2026 03:52
amrit110 marked this pull request as ready for review January 29, 2026 03:53

tool_calls: list[dict] = Field(default_factory=list)


def create_google_search_tool() -> GoogleSearchTool:

Collaborator

Why use a wrapper instead of providing the tool directly?

Member Author

Because I'm setting bypass_multi_tools_limit=True and wanted to be intentional about it; see the docstring.
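
To illustrate the wrapper approach discussed in this thread, here is a minimal sketch of a factory that pins one flag in a single documented place. `StubSearchTool` is a local stand-in written for this example, not the ADK class, and `create_search_tool` is a hypothetical name; only the `bypass_multi_tools_limit=True` setting comes from the comment above.

```python
from dataclasses import dataclass


@dataclass
class StubSearchTool:
    """Local stand-in for a search tool class; not the real ADK type."""

    bypass_multi_tools_limit: bool = False


def create_search_tool() -> StubSearchTool:
    """Build the search tool with the multi-tool limit bypass enabled.

    Keeping the flag inside one documented factory means every caller gets
    the same setting, and the reason for it is recorded in one place.
    """
    return StubSearchTool(bypass_multi_tools_limit=True)
```

Callers that construct the tool directly would get the default (`False`), which is the kind of silent divergence a factory is meant to avoid.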

Collaborator

Would it be better to have a tools package under aieng.agent_evals to hold other tools from other agents or to have it under aieng.agent_evals.<agent>.tools?

Member Author

Hmm, not really sure at this point. We don't want code duplication, but I don't know to what extent we can reuse the same tools across agents. For tracing and evals, though, it would be cleaner and more consistent to have a single tools package. @lotif what do you think?

Member Author

I took a quick look at your draft PR, @fcogidi, and I think the shared tools package isn't a bad idea. I can align this PR with that as well.

Collaborator

I could see what you guys are using and change my implementation. At some point I'll try to switch to Google ADK as well. Even though I think the langfuse evals I'm running are really handy and easy, I still need to look in more detail at what you guys are doing.

>>> if init_tracing():
... print("Tracing enabled!")
"""
global _instrumented, _langfuse_client # noqa: PLW0603

Collaborator

I have added a langfuse client to the AsyncClientManager class in my PR, but it's debatable whether it really belongs there. We should stick to one solution, and I vote not to use global variables. AsyncClientManager is a singleton, which is a slightly cleaner solution.

Member Author

Makes sense. I've extended your AsyncClientManager to work for me as well, so it's now compatible with your PR. I also removed the use of global variables and aligned with the singleton pattern.
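
The singleton-over-globals approach agreed on in this thread can be sketched as follows. This is not the repository's AsyncClientManager: the class name `ClientManager`, its methods, and the `tracing_factory` parameter are hypothetical, and a plain callable stands in for whatever constructs the real Langfuse client.

```python
import threading
from typing import Any, Callable, Optional


class ClientManager:
    """Hypothetical singleton that owns shared clients instead of module globals."""

    _instance: Optional["ClientManager"] = None
    _lock = threading.Lock()

    def __init__(self, tracing_factory: Callable[[], Any]) -> None:
        self._tracing_factory = tracing_factory
        self._tracing_client: Optional[Any] = None

    @classmethod
    def get_instance(cls, tracing_factory: Callable[[], Any] = object) -> "ClientManager":
        """Return the shared instance, creating it on first use.

        The factory is only used when the instance is first created; the
        default `object` is a placeholder that builds a dummy client.
        """
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls(tracing_factory)
            return cls._instance

    @property
    def tracing_client(self) -> Any:
        """Create the tracing client lazily and reuse it afterwards."""
        if self._tracing_client is None:
            self._tracing_client = self._tracing_factory()
        return self._tracing_client

    @classmethod
    def reset(cls) -> None:
        """Drop the shared instance (useful in tests)."""
        with cls._lock:
            cls._instance = None
```

Compared with `global` statements, the state lives on one object that can be reset or replaced in tests. The lazy property here is not guarded by the lock, so a production version would likely need more care under concurrency.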

amrit110 requested review from fcogidi and lotif January 29, 2026 22:04

Collaborator

lotif left a comment

Thanks for addressing the comments :)

amrit110 merged commit 412298a into main Jan 30, 2026
3 checks passed
amrit110 deleted the feature/knowledge-agent branch January 30, 2026 18:29
lotif added a commit that referenced this pull request Jan 30, 2026
commit 4507d52
Merge: b4e124d 412298a
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Fri Jan 30 16:58:09 2026 -0500

    Merge branch 'main' into marcelo/langfuse-integration

commit 412298a
Author: Amrit Krishnan <amrit110@gmail.com>
Date:   Fri Jan 30 13:29:20 2026 -0500

    Feature/knowledge agent (#18)

    * Add initial working implementation using search grounding

    * [pre-commit.ci] Add auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    * Remove example implementation

    * Fix GHSA-wp53-j4wj-2cfg, pin python-multipart version

    * Update agent to ReAct, fix grounding tool

    * Update README.md

    * Add tracing to langfuse

    * Clear notebook cells

    * Remove python-multipart as direct dependency and only update it

    * Remove D103 and E402 from being ignored in pre-commit check and fix notebooks

    * Move imports to top of the file

    * Simplify tracing module to just read directly from env variables

    * Rename async client manager for agent, reuse existing async client manager for tracing

    * Clarify optional dataset variable in docstring

    * Fix format_response_with_citations

    * Return results instead of modifying input params

    * Use pydantic native desc docstring instead of numpy style

    * Unify config to use same across agents

    * Use ADK's session management, remove custom implementation

    * Remove weaviate from client manager

    ---------

    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

commit b4e124d
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Thu Jan 29 17:00:58 2026 -0500

    Small fixes, additional logging and updated groud truth

commit 7d59004
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Wed Jan 28 16:30:23 2026 -0500

    Upgrading python-multipart + small improvements

commit 2906b36
Merge: 285591b bba7326
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Wed Jan 28 16:12:03 2026 -0500

    Merge branch 'main' into marcelo/langfuse-integration

commit 285591b
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Wed Jan 28 16:09:16 2026 -0500

    Adding readme instructions

commit 37348c0
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Wed Jan 28 15:53:47 2026 -0500

    Minor improvements

commit 9fdc71d
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Wed Jan 28 15:46:25 2026 -0500

    Addingh evaluator and retry mechanism

commit 5af7152
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Wed Jan 28 14:41:36 2026 -0500

    Using langfuse to upload a dataset and run the evaluation

commit c1980fe
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Wed Jan 28 12:39:33 2026 -0500

    Adding the eval dataset and making changes to the eval script. Adding tenacity for retrying mechanism

commit 02c3ac5
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Tue Jan 27 16:57:37 2026 -0500

    Added code comments

commit da9b0c9
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Tue Jan 27 16:50:06 2026 -0500

    Finished using LLMs to evaluate result

commit f0af403
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Tue Jan 27 13:51:06 2026 -0500

    Moving forward with the evaluation script + some more refactorings

commit 93ee157
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Tue Jan 27 11:36:52 2026 -0500

    Reporting to langfuse and removed clutter

commit d029285
Merge: a39ac1d 9549395
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Tue Jan 27 11:00:28 2026 -0500

    Merge branch 'main' into marcelo/langfuse-integration

commit a39ac1d
Merge: cdf0647 efd80cb
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Mon Jan 26 17:09:09 2026 -0500

    Merge branch 'marcelo/report-agent' into marcelo/langfuse-integration

commit efd80cb
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Mon Jan 26 17:03:37 2026 -0500

    CR by Franklin

commit 7a2a57f
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Mon Jan 26 16:31:49 2026 -0500

    CR by Franklin

commit cdf0647
Merge: 53d0589 534f8e5
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Mon Jan 26 16:25:19 2026 -0500

    Merge branch 'marcelo/report-agent' into marcelo/langfuse-integration

commit 534f8e5
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Mon Jan 26 16:19:30 2026 -0500

    CR by Franklin

commit 53d0589
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Mon Jan 26 16:07:33 2026 -0500

    Some more langfuse things

commit 40dfc6f
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Mon Jan 26 13:42:41 2026 -0500

    Parsing client responses into langfuse traces

commit 20e4ec5
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Mon Jan 26 11:42:38 2026 -0500

    Small refactor

commit ee8b854
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Mon Jan 26 11:36:14 2026 -0500

    Moving env and logging config to the top of the file

commit 66a4494
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Mon Jan 26 11:13:05 2026 -0500

    CR by Amrit

commit f9d7862
Merge: dc02ff2 9042ace
Author: Marcelo Lotif <lotif@users.noreply.github.com>
Date:   Mon Jan 26 11:12:42 2026 -0500

    Merge branch 'main' into marcelo/report-agent

commit dc02ff2
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Fri Jan 23 12:56:56 2026 -0500

    Grammar fixes

commit 530360e
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Fri Jan 23 12:42:50 2026 -0500

    Adding a couple more vulnerabilities to the skip list

commit 7bb081f
Merge: 6e3c4c2 bd34ef0
Author: Marcelo Lotif <lotif@users.noreply.github.com>
Date:   Fri Jan 23 12:37:19 2026 -0500

    Merge branch 'main' into marcelo/report-agent

commit 6e3c4c2
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Fri Jan 23 12:35:08 2026 -0500

    One more readme paragraph

commit 37b4000
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Fri Jan 23 12:27:23 2026 -0500

    Movign files around, adding the ddl file and the import script

commit 3458565
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Thu Jan 22 14:39:47 2026 -0500

    Generating xlsx reports

commit 22fc569
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Thu Jan 22 12:28:40 2026 -0500

    Adding more report examples

commit 6592a1c
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Thu Jan 22 11:55:41 2026 -0500

    Deleting weaviate stuff, using Online Retail dataset instead

commit 0098f7d
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Thu Jan 22 11:37:51 2026 -0500

    Weaviate local and remote scripts

commit 9e6ce2e
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Wed Jan 21 11:47:00 2026 -0500

    Adding data import for the online retail dataset and some more instructions

commit a77a60f
Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai>
Date:   Fri Jan 16 17:56:15 2026 -0500

    WIp trying to make it work

Labels

enhancement New feature or request

3 participants