How to Create a Knowledge Base Collection

1. Introduction: What is a Collection?

A Collection is the knowledge base or source of truth for a Knowledge Agent. It is a curated set of documents, articles, and content that the AI will use to answer user questions. Think of it as the library your agent reads from.

2. Accessing the Collections Section

Log in to your Bridged account.
From the main navigation sidebar, click on Collections.
You will see the Collections dashboard with a list of your existing collections and their details (e.g., number of pages, media documents, creation date).

To start a new collection, click the + Create Collections button at the top left.

3. The 4-Step Creation Process

Creating a new collection is a guided 4-step process: Goal → Engine → Crawl → Publish.

Step 1: Goal – Select Collection Type

You will be prompted to choose the type of collection you want to create.
Select Knowledge Agent Collection. This tells the system that this collection will be used as a knowledge base for answering questions.
Click Next to proceed.

Step 2: Engine – Name Your Collection

Collection title: Enter a unique and descriptive name (e.g., "KA for BE website," "Product Documentation 2025").
Indexed pages: You may see a default domain or be asked to enter the primary website URL you will be working with (e.g., https://bridged.events).
Click Next to continue.

Step 3: Crawl – Adding Content to Your Collection

This is the core step where you populate your collection with content. You have three primary methods to choose from:

A. Integrate with WordPress Plugin

This option is for users who have a WordPress site. By installing the Bridged WordPress plugin, you can establish a direct connection, allowing for seamless and automatic content syncing.

B. Utilize XML Sitemap Integration

This is the most common method for pulling content from a website. You provide your sitemap URL(s), and Bridged crawls them to index your pages.

How to Find Your Sitemap XML Using robots.txt:
If you don't know your sitemap URL, the easiest way to find it is by checking your website's `robots.txt` file. This file tells search engines (and Bridged) where to find your sitemaps.

Step-by-step:

Open a web browser.
In the address bar, type your website's domain name followed by /robots.txt.
Example: https://bridged.events/robots.txt
Press Enter. You will see a text file with rules for web crawlers.
Look for a line that starts with Sitemap:. It will look something like this:
Sitemap: https://bridged.events/sitemap.xml
Sitemap: https://bridged.events/sitemap_index.xml
Copy the full URL listed after Sitemap:
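
If you would rather script this lookup, the sketch below shows one way to pull the Sitemap: lines out of a robots.txt file using only Python's standard library. The function names `extract_sitemaps` and `fetch_sitemaps` and the sample rules are illustrative assumptions and are not part of Bridged. Treat it as a minimal sketch, not a description of how Bridged detects sitemaps.

```python
from urllib.request import urlopen


def extract_sitemaps(robots_txt: str) -> list[str]:
    """Return the URLs listed on 'Sitemap:' lines of a robots.txt body."""
    sitemaps = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, because the URL itself contains colons.
        field, sep, value = line.partition(":")
        # The field name is matched case-insensitively ("sitemap", "Sitemap").
        if sep and field.strip().lower() == "sitemap":
            url = value.strip()
            if url:
                sitemaps.append(url)
    return sitemaps


def fetch_sitemaps(domain: str) -> list[str]:
    """Download https://<domain>/robots.txt and extract its sitemap URLs."""
    with urlopen(f"https://{domain}/robots.txt", timeout=10) as response:
        body = response.read().decode("utf-8", errors="replace")
    return extract_sitemaps(body)


if __name__ == "__main__":
    # Hypothetical robots.txt content, used so the example runs offline.
    sample = (
        "User-agent: *\n"
        "Disallow: /admin/\n"
        "Sitemap: https://bridged.events/sitemap.xml\n"
        "Sitemap: https://bridged.events/sitemap_index.xml\n"
    )
    for url in extract_sitemaps(sample):
        print(url)
```

To check a live site, call `fetch_sitemaps("bridged.events")` and paste any of the returned URLs into Bridged.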

Adding Your Sitemap to Bridged:

Select the Utilize XML Sitemap Integration option.
You will see a section for Enter your sitemap urls.
You can choose between:
- Current sitemaps: If Bridged auto-detects any.
- Manually add sitemaps: Click this to enter your sitemap URL manually.
Paste the sitemap URL you found (e.g., https://bridged.events/sitemap.xml) into the provided field.
Click Add. The URL will appear in the list below.
Once added, Bridged will begin crawling the sitemap. The status will show as Crawling.
After crawling, you will see a list of all the pages found in the sitemap. You can then:
- Select manually: Hand-pick specific pages to include in your collection.
- Select pages with certain criteria: Use filters to include pages based on rules (e.g., all pages in the /blog/ section).
- Use the Select all pages option at the bottom to include everything.
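
To preview which pages a criteria-based selection might match before you choose them in Bridged, you can inspect the sitemap yourself. The sketch below reads the `<loc>` entries from a sitemap document and keeps those whose path starts with a given prefix such as /blog/. The helper names `page_urls` and `filter_by_path` and the sample sitemap are hypothetical, and Bridged's own filter rules may behave differently.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol for <urlset> and <sitemapindex>.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def page_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> value found in a sitemap or sitemap index."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


def filter_by_path(urls: list[str], prefix: str) -> list[str]:
    """Keep only the URLs whose path begins with the given prefix."""
    return [url for url in urls if urlparse(url).path.startswith(prefix)]


if __name__ == "__main__":
    # Hypothetical sitemap content, used so the example runs offline.
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://bridged.events/blog/first-post</loc></url>"
        "<url><loc>https://bridged.events/pricing</loc></url>"
        "</urlset>"
    )
    for url in filter_by_path(page_urls(sample), "/blog/"):
        print(url)
```

Note that in a sitemap index file the `<loc>` entries point to further sitemaps, not to pages, so each of those would need to be read in turn.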

Note: The crawling process might take up to 5 minutes. Do not close this tab while it runs; otherwise, the process will be terminated.

C. Manual Upload

This method is for adding content that isn't publicly crawlable or for supplementing your collection with offline documents.

Select the Manual Upload option.
You will have two tabs or areas:
- Upload page URLs: Enter individual webpage URLs one by one. Type or paste the URL and click Add.
- Upload files: Click or drag and drop media files (like PDFs) into the upload area.
  - Note: The maximum file size is 50 MB.
  - For each file, you can select the Language and add an optional Media reference URL.
Once all your URLs and files are added, they are ready for processing.
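
If you are preparing many documents, it can help to check them against the 50 MB limit before uploading. The sketch below lists the files in a local folder that exceed the limit. It assumes that "50 MB" means 50 × 1024 × 1024 bytes, which this guide does not confirm, and the names `sizes_in` and `oversized` are hypothetical helpers, not part of Bridged.

```python
from pathlib import Path

# Assumption: the 50 MB limit is counted as 50 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def sizes_in(folder: str) -> dict[str, int]:
    """Map each file name directly inside the folder to its size in bytes."""
    return {
        path.name: path.stat().st_size
        for path in Path(folder).iterdir()
        if path.is_file()
    }


def oversized(sizes: dict[str, int], limit: int = MAX_UPLOAD_BYTES) -> list[str]:
    """Return, in alphabetical order, the names of files larger than the limit."""
    return sorted(name for name, size in sizes.items() if size > limit)


if __name__ == "__main__":
    # Checks the current working directory; pass another folder path as needed.
    too_big = oversized(sizes_in("."))
    if too_big:
        print("Too large to upload:", ", ".join(too_big))
    else:
        print("All files are within the limit.")
```

Files reported as too large would need to be split or compressed before you add them to the collection.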

The files you upload (PDFs, etc.) are stored and will be publicly visible: users will see a reference link for each document. The "Media reference URL" field lets you override that link and provide a custom public link (e.g., a signup page or external article) instead of the internal file.

Step 4: Publish – Finalize Your Collection

After successfully crawling/uploading your content, you will see an overview of your collection.
This screen shows the Content Mapping Progress (e.g., "26/26 items mapped").
You can review:
- Target Urls: A list of all the web pages added, including their author, date, and mapping status.
- Media Files: A list of all uploaded documents, including file name, reference URL, file type, and upload date.
From here, you can:
- + Add Urls: Add more content.
- Refresh Urls: Re-crawl existing URLs to update the content.
- Delete Urls / Delete Media: Remove unwanted items.
Once you are satisfied, you can proceed. The final button often says Next: Create Agent, linking you directly to creating a Knowledge Agent using this new collection.

4. Using Your Collection with a Knowledge Agent

After your Collection is published, it will appear in the Select the Collection section on the design step when you are configuring a Knowledge Agent. Simply choose this collection, and your agent will use it as its source of truth.

You can add up to 3 different collections to a Knowledge Agent.