
📄 Choose the Documents to Ingest into Your Knowledge Base

Feeding your knowledge base is a strategic decision: what you add directly determines what Didask AI can do for your teams.

Written by Océane

⏱️ The Essentials in 3 Minutes

• Start by identifying your learners' actual needs before importing anything.
• Integrate a document only if it addresses an existing need, is up to date, and has someone responsible for maintaining it.
• Set access rights according to which users each piece of information concerns.
• A poor document in the base can degrade AI responses just as much as missing information.


🧠 Understand the Value of Choosing Documents Carefully

Every document integrated into the knowledge base influences what the AI can answer, and how. A document that is outdated, redundant, or unsuited to its audience can lead to incorrect or off-topic responses, even if the rest of your base contains quality content.

Choosing documents methodically ensures the AI becomes a genuine skills assistant for your teams, rather than a source of confusion. It also keeps the effort of building your knowledge base sustainable: a small, well-maintained base outperforms a large, unmaintained one.


🎯 Define Your Learners' Needs

Before importing anything, clarify the situations in which the AI needs to be useful.

Recurring needs:

  • On which topics do your colleagues come to you most often (questions in meetings, messages, emails)?

  • When someone joins your team, what do they lose time on during the first few weeks?

Grey areas and edge cases:

  • Are there recurring decisions where your colleagues do not always know which rule to apply?

  • Are there nuances or exceptions that only long-standing team members know?

  • Are there topics where practices vary from person to person, due to a lack of a common reference?


🔍 Identify Documents That Address These Needs

For each identified need, ask yourself:

Document existence:

  • Is there already reliable information addressing this need somewhere (procedure, guide, internal policy, FAQ, presentation)?

  • Is there a reference answer given verbally but never written down? If so, it should be documented before being integrated.

Access scope:

  • Does this information concern all your learners, or only a specific population (managers, a team, a particular training space)?

  • Is there a risk in making this information accessible to everyone (sensitive data, HR information, content reserved for certain roles)?

If the information only concerns a subset of your learners, restrict the document's access rights at import time.
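For teams that track candidate documents in a script or spreadsheet export, the scoping rule above can be sketched as follows. This is a minimal illustration, not Didask's API: the `DocumentScope` class, the `access_groups` function, and the `"all learners"` label are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

ALL_LEARNERS = "all learners"  # hypothetical label for unrestricted access


@dataclass
class DocumentScope:
    title: str
    # Populations the information concerns; empty means it concerns everyone.
    concerned_groups: set = field(default_factory=set)
    # True for sensitive data, HR information, or role-reserved content.
    sensitive: bool = False


def access_groups(doc: DocumentScope) -> set:
    """Return the groups that should be allowed to see the document."""
    if doc.concerned_groups:
        # The information concerns a subset of learners: restrict to that subset.
        return set(doc.concerned_groups)
    if doc.sensitive:
        # Sensitive content must never default to everyone (assumption of this sketch).
        raise ValueError(f"{doc.title!r} is sensitive: name the groups allowed to see it")
    return {ALL_LEARNERS}
```

In this sketch, a sensitive document with no named audience is rejected rather than opened to everyone, so the scoping question has to be answered before import.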


✅ Check Document Quality Before Ingesting

A poor document in the base can degrade response quality just as much as missing information.

Reliability:

  • Is the document up to date and does it reflect current practices?

  • Is it understandable on its own, without verbal explanation or additional context?

  • Is there another document on the same topic in the base? If so, are the two consistent? Can the new document be limited to information the existing one does not cover, to avoid duplication?

Sustainability:

  • Is the content stable over time, or likely to change frequently (pricing, contacts, ongoing decisions)?

  • If the document evolves, who will be responsible for updating it in the base?


🧪 Apply the 3-Question Test

Before any import, check these three criteria:

  1. A learner has already needed this information, or will likely need it.

  2. The information is correct, up to date, and self-contained: it can be understood without further explanation.

  3. Someone is responsible for maintaining it: the base degrades if no one is in charge.

If all three criteria are met: import. Otherwise, take the time to correct the document before integrating it.
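If you review candidate documents in bulk, the test can be expressed as a small checklist. The sketch below is only an illustration under stated assumptions: the `Candidate` class, its field names, and both functions are hypothetical, and the answers to the three criteria still come from a human reviewer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    title: str
    needed_by_learners: bool              # criterion 1
    correct_current_self_contained: bool  # criterion 2
    owner: Optional[str] = None           # criterion 3: who maintains it


def unmet_criteria(doc: Candidate) -> list:
    """List the criteria the document fails; an empty list means it passes."""
    unmet = []
    if not doc.needed_by_learners:
        unmet.append("no learner need identified")
    if not doc.correct_current_self_contained:
        unmet.append("not correct, up to date, and self-contained")
    if not doc.owner:
        unmet.append("no one responsible for maintenance")
    return unmet


def should_import(doc: Candidate) -> bool:
    """Import only when all three criteria are met."""
    return not unmet_criteria(doc)
```

Returning the list of unmet criteria, rather than a bare yes or no, tells the reviewer what to correct before integrating the document.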


Keywords: document selection, ingestion, knowledge base, Didask AI, quality, access rights, document strategy.
