How to Create a Knowledge Base Collection
1. Introduction: What is a Collection?
A Collection is the knowledge base or source of truth for a Knowledge Agent. It is a curated set of documents, articles, and content that the AI will use to answer user questions. Think of it as the library your agent reads from.
2. Accessing the Collections Section
Log in to your Bridged account.
From the main navigation sidebar, click on Collections.
You will see the Collections dashboard with a list of your existing collections and their details (e.g., number of pages, media documents, creation date).
To start a new collection, click the + Create Collections button at the top left.
3. The 4-Step Creation Process
Creating a new collection is a guided 4-step process: Goal → Engine → Crawl → Publish.
Step 1: Goal – Select Collection Type
You will be prompted to choose the type of collection you want to create.
Select Knowledge Agent Collection. This tells the system that this collection will be used as a knowledge base for answering questions.
Click Next to proceed.
Step 2: Engine – Name Your Collection
Collection title: Enter a unique and descriptive name (e.g., "KA for BE website," "Product Documentation 2025").
Indexed pages: You may see a default domain or be asked to enter the primary website URL you will be working with (e.g., https://bridged.events).
Click Next to continue.
Step 3: Crawl – Adding Content to Your Collection
This is the core step where you populate your collection with content. You have three primary methods to choose from:
A. Integrate with WordPress Plugin
This option is for users who have a WordPress site. By installing the Bridged WordPress plugin, you can establish a direct connection, allowing for seamless and automatic content syncing.
B. Utilize XML Sitemap Integration
This is the most common method for pulling content from a website. You provide your sitemap URL(s), and Bridged crawls them to index your pages.
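To preview which pages a sitemap would hand to a crawler, the sketch below reads a sitemap document and lists its entries. It is a minimal illustration using Python's standard library and the public sitemaps.org XML format. It does not reflect how Bridged processes sitemaps internally, and the function name and sample page paths are hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def list_sitemap_entries(xml_text):
    """Return (kind, urls) for a sitemap document.

    kind is "urlset" for a regular sitemap (urls are pages) or
    "sitemapindex" for an index file (urls are child sitemaps).
    """
    root = ET.fromstring(xml_text)
    kind = root.tag.replace(SITEMAP_NS, "")
    urls = [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]
    return kind, urls


# Hypothetical sample content; a real sitemap would be downloaded first.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://bridged.events/</loc></url>
  <url><loc>https://bridged.events/blog/hello</loc></url>
</urlset>"""

kind, urls = list_sitemap_entries(SAMPLE)
print(kind, len(urls))
```

If the result kind is "sitemapindex", each listed URL is itself a sitemap and would need to be read the same way to reach the individual pages.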
How to Find Your Sitemap XML Using robots.txt:
If you don't know your sitemap URL, the easiest way to find it is by checking your website's robots.txt file. This file tells search engines (and Bridged) where to find your sitemaps.
Open a web browser.
In the address bar, type your website's domain name followed by /robots.txt.
Example: https://bridged.events/robots.txt
Press Enter. You will see a text file with rules for web crawlers.
Look for a line that starts with Sitemap:. It will look something like this:
Sitemap: https://bridged.events/sitemap.xml
Sitemap: https://bridged.events/sitemap_index.xml
Copy the full URL listed after Sitemap:.
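The same lookup can be scripted when you have many sites to check. The sketch below extracts Sitemap: entries from the text of a robots.txt file. It assumes you have already saved or downloaded the file's contents; the function name and the sample rules are illustrative only.

```python
def find_sitemaps(robots_txt):
    """Return every URL declared on a 'Sitemap:' line, in file order."""
    sitemaps = []
    for line in robots_txt.splitlines():
        # Split only on the first colon so the URL's own "https:" stays intact.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps


# Hypothetical robots.txt content for illustration.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /wp-admin/
Sitemap: https://bridged.events/sitemap.xml
sitemap: https://bridged.events/sitemap_index.xml
"""

for url in find_sitemaps(SAMPLE_ROBOTS):
    print(url)
```

The field name is matched case-insensitively because robots.txt files in the wild are not consistent about capitalization. On Python 3.8 or later, urllib.robotparser.RobotFileParser also offers a site_maps() method that returns the same information for a live URL.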
Adding Your Sitemap to Bridged:
Select the Utilize XML Sitemap Integration option.
You will see a section labeled Enter your sitemap urls.
You can choose between:
Current sitemaps: If Bridged auto-detects any.
Manually add sitemaps: Click this to enter your sitemap URL manually.
Paste the sitemap URL you found (e.g., https://bridged.events/sitemap.xml) into the provided field.
Click Add. The URL will appear in the list below.
Once added, Bridged will begin crawling the sitemap. The status will show as Crawling.
After crawling, you will see a list of all the pages found in the sitemap. You can then:
Select manually: Hand-pick specific pages to include in your collection.
Select pages with certain criteria: Use filters to include pages based on rules (e.g., all pages in the /blog/ section).
Use the Select all pages option at the bottom to include everything.
Note: The crawling process might take up to 5 minutes. Do not close this tab while it runs; otherwise the process will be terminated.
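To see in advance which pages a rule such as "all pages in the /blog/ section" would match, you can apply the same idea to a list of URLs yourself. The sketch below is an approximation of path-based filtering, not Bridged's actual filter logic; the function name and the sample URLs are hypothetical.

```python
from urllib.parse import urlsplit


def select_by_path_prefix(urls, prefix):
    """Keep only the URLs whose path starts with the given prefix."""
    return [u for u in urls if urlsplit(u).path.startswith(prefix)]


# Hypothetical crawl result for illustration.
PAGES = [
    "https://bridged.events/",
    "https://bridged.events/blog/hello",
    "https://bridged.events/blog/2025/launch",
    "https://bridged.events/pricing",
]

for url in select_by_path_prefix(PAGES, "/blog/"):
    print(url)
```

Comparing against the parsed path rather than the whole URL avoids accidental matches on the domain name or query string.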
C. Manual Upload
This method is for adding content that isn't publicly crawlable or for supplementing your collection with offline documents.
Select the Manual Upload option.
You will have two tabs or areas:
Upload page URLs: Enter individual webpage URLs one by one. Type or paste the URL and click Add.
Upload files: Click or drag and drop media files (like PDFs) into the upload area.
Note: The maximum file size is 50 MB.
For each file, you can select the Language and add an optional Media reference URL.
Once all your URLs and files are added, they are ready for processing.
The files you upload (PDFs, etc.) are stored and will be publicly visible. However, users will see a reference link for each document. The "Media reference URL" field allows you to override this and provide a custom public link (e.g., a signup page or external article) instead of the internal file.
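Before uploading a batch of documents, it can save time to check them against the 50 MB limit noted above. The sketch below flags local files that are too large. It assumes 1 MB means 1024 × 1024 bytes, which the guide does not specify, and the constant and function names are illustrative.

```python
import os

# Assumption: 50 MB is interpreted as 50 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def exceeds_limit(size_bytes, limit=MAX_UPLOAD_BYTES):
    """Return True when a file of this size is over the upload limit."""
    return size_bytes > limit


def oversized_files(paths, limit=MAX_UPLOAD_BYTES):
    """Return the paths of files on disk that are over the upload limit."""
    return [p for p in paths if exceeds_limit(os.path.getsize(p), limit)]


print(exceeds_limit(10 * 1024 * 1024))  # a 10 MB file
print(exceeds_limit(60 * 1024 * 1024))  # a 60 MB file
```

Call oversized_files with the list of paths you intend to upload; any path it returns would need to be compressed or split first.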
Step 4: Publish – Finalize Your Collection
After successfully crawling/uploading your content, you will see an overview of your collection.
This screen shows the Content Mapping Progress (e.g., "26/26 items mapped").
You can review:
Target Urls: A list of all the web pages added, including their author, date, and mapping status.
Media Files: A list of all uploaded documents, including file name, reference URL, file type, and upload date.
From here, you can:
+ Add Urls: Add more content.
Refresh Urls: Re-crawl existing URLs to update the content.
Delete Urls / Delete Media: Remove unwanted items.
Once you are satisfied, you can proceed. The final button often says Next: Create Agent, which takes you directly to creating a Knowledge Agent using this new collection.
4. Using Your Collection with a Knowledge Agent
After your Collection is published, it will appear in the Select the Collection section on the design step when you are configuring a Knowledge Agent. Simply choose this collection, and your agent will use it as its source of truth.