Repsoft Document Extractor API
Enterprise-grade document processing API for automated data extraction. Seamlessly integrate with your ERP systems for intelligent document automation.
Quick Start
Overview
The Repsoft Document Extractor API provides enterprise-grade document processing capabilities with asynchronous workflows. Submit documents for intelligent data extraction and receive immediate tracking identifiers for status monitoring.
Base URL
http://hub.api.cefero.com
Core Capabilities
- High-Volume Processing: Handle files up to 500MB with optimized upload speeds
- Asynchronous Architecture: Non-blocking operations with immediate tracking ID response
- Real-Time Monitoring: Poll-based status updates for process visibility
- Intelligent Queuing: Automatic load balancing and priority handling
- Structured Results: JSON-formatted extracted data ready for ERP integration
- Enterprise Security: Bearer token authentication with optional client key override
Authentication
All API requests require authentication headers for authorization and client identification. These headers must be included in every request.
Don't have API credentials yet? Contact us to request API access and we'll set up your account with the necessary authentication tokens.
Required Headers
| Header Name | Value | Description |
|---|---|---|
| Authorization | cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw | Production API authentication token |
| X-ClientId | <Your Company Name> | Your organization identifier |
Example Request Headers
Authorization: cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw
X-ClientId: Your Company Name
Replace <Your Company Name> with your actual organization name. Both headers are required for all API endpoints.
Store the Authorization token securely using environment variables or secret management systems. Never commit credentials to version control or expose them in client-side code.
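As a minimal sketch of that advice, the two required headers can be built from environment variables instead of being hard-coded. The variable names `CEFERO_API_TOKEN` and `CEFERO_CLIENT_ID` are assumptions made for this illustration; neither the API nor the SDK defines them.

```python
import os
from typing import Mapping, Optional


def build_auth_headers(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build the two required request headers from environment variables.

    CEFERO_API_TOKEN and CEFERO_CLIENT_ID are hypothetical variable names.
    """
    env = os.environ if env is None else env
    token = env.get("CEFERO_API_TOKEN", "").strip()
    client_id = env.get("CEFERO_CLIENT_ID", "").strip()
    if not token or not client_id:
        # Fail early instead of sending a request that would be rejected.
        raise RuntimeError("Set CEFERO_API_TOKEN and CEFERO_CLIENT_ID first")
    return {"Authorization": token, "X-ClientId": client_id}
```

Passing a plain dictionary as `env` makes the helper easy to exercise in tests without touching the real environment.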
Upload File
POST /v1/files
Submit a file for processing. Returns immediately with a tracking ID and queues the file for extraction.
Request Headers
| Header | Type | Required | Description |
|---|---|---|---|
| Authorization | string | Required | API authentication token |
| X-ClientId | string | Required | Your organization identifier |
Request Body (multipart/form-data)
| Field | Type | Required | Description |
|---|---|---|---|
| UploadFiles | file | Required | File to upload (max 500MB) |
Example Request
POST /v1/files HTTP/1.1
Host: hub.api.cefero.com
Authorization: cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw
X-ClientId: Your Company Name
Content-Type: multipart/form-data; boundary=----WebKitFormBoundary
------WebKitFormBoundary
Content-Disposition: form-data; name="UploadFiles"; filename="document.pdf"
Content-Type: application/pdf
[binary file data]
------WebKitFormBoundary--
Example Response (202 Accepted)
{
"fileId": "a3b8c9d0-1234-5678-90ab-cdef12345678",
"status": "queued",
"message": "File upload queued for processing",
"queuedAt": "2025-11-24T10:30:00Z",
"fileName": "document.pdf",
"fileSize": 1048576
}
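The request above can be assembled with the Python standard library alone. The sketch below only builds the request object and does not send it; the function name `build_upload_request` and the generic `application/octet-stream` part type are choices made for this illustration, not something the API prescribes.

```python
import urllib.request
import uuid

BASE_URL = "http://hub.api.cefero.com"


def build_upload_request(filename: str, data: bytes, headers: dict) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST to /v1/files."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        # The form field must be named UploadFiles, as in the table above.
        f'Content-Disposition: form-data; name="UploadFiles"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    request = urllib.request.Request(f"{BASE_URL}/v1/files", data=body, method="POST")
    for name, value in headers.items():
        request.add_header(name, value)
    request.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    return request
```

Sending it would be a call to `urllib.request.urlopen(request)`, after which the `fileId` field of the JSON body is the value to keep for status checks.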
Response Codes
Get File Status
GET /v1/files/{file_id}
Check the current processing status of your file using its unique identifier.
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_id | string (GUID) | Required | The unique identifier returned from the upload endpoint |
Request Headers
| Header | Type | Required | Description |
|---|---|---|---|
| Authorization | string | Required | API authentication token |
| X-ClientId | string | Required | Your organization identifier |
Example Request
GET /v1/files/a3b8c9d0-1234-5678-90ab-cdef12345678 HTTP/1.1
Host: hub.api.cefero.com
Authorization: cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw
X-ClientId: Your Company Name
Example Response (200 OK - Completed)
{
"fileId": "a3b8c9d0-1234-5678-90ab-cdef12345678",
"status": "completed",
"fileName": "document.pdf",
"fileSize": 1048576,
"uploadedAt": "2025-11-24T10:30:00Z",
"completedAt": "2025-11-24T10:30:45Z",
"processingTime": 45,
"results": {
"processed": true,
"metadata": {
"pages": 10,
"format": "PDF"
}
}
}
Example Response (202 Accepted - Processing)
{
"fileId": "a3b8c9d0-1234-5678-90ab-cdef12345678",
"status": "processing",
"fileName": "document.pdf",
"uploadedAt": "2025-11-24T10:30:00Z",
"message": "File is currently being processed"
}
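A small sketch of how a caller might interpret these two payloads. In the completed sample, `completedAt` minus `uploadedAt` is 45 seconds, which matches `processingTime: 45`; that suggests the field is in seconds, but this is an inference from the example rather than a documented unit. The helper names are hypothetical.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse the UTC timestamps shown above (e.g. 2025-11-24T10:30:00Z)."""
    # Replace the trailing Z so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_status(payload: dict) -> str:
    """Turn a status payload into a one-line summary."""
    status = payload.get("status")
    name = payload.get("fileName", "file")
    if status == "completed":
        started = parse_timestamp(payload["uploadedAt"])
        finished = parse_timestamp(payload["completedAt"])
        seconds = int((finished - started).total_seconds())
        return f"{name}: completed in {seconds}s"
    # Any other status (queued, processing, ...) is reported as-is.
    return f"{name}: {status}"
```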
Response Codes
Get Recent Results
GET /v1/files/results
Retrieve a list of your recently completed documents with their processing results.
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| limit | integer | Optional | Number of recent files to retrieve (1-10, default: 10) |
Request Headers
| Header | Type | Required | Description |
|---|---|---|---|
| Authorization | string | Required | API authentication token |
| X-ClientId | string | Required | Your organization identifier |
Example Request
GET /v1/files/results?limit=5 HTTP/1.1
Host: hub.api.cefero.com
Authorization: cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw
X-ClientId: Your Company Name
Example Response (200 OK)
{
"totalResults": 5,
"limit": 5,
"files": [
{
"fileId": "a3b8c9d0-1234-5678-90ab-cdef12345678",
"status": "completed",
"fileName": "document1.pdf",
"fileSize": 1048576,
"uploadedAt": "2025-11-24T10:30:00Z",
"completedAt": "2025-11-24T10:30:45Z",
"processingTime": 45
},
{
"fileId": "b4c9d0e1-2345-6789-01bc-def123456789",
"status": "completed",
"fileName": "document2.pdf",
"fileSize": 2097152,
"uploadedAt": "2025-11-24T09:15:00Z",
"completedAt": "2025-11-24T09:16:30Z",
"processingTime": 90
}
]
}
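Because `limit` only accepts values from 1 to 10, it can be worth validating on the client before the request is made. The following is a sketch under that assumption; `results_url` is a hypothetical helper, and how the server reacts to an out-of-range value is not documented here.

```python
from urllib.parse import urlencode

BASE_URL = "http://hub.api.cefero.com"


def results_url(limit: int = 10) -> str:
    """Build the /v1/files/results URL, enforcing the documented 1-10 range."""
    if not 1 <= limit <= 10:
        raise ValueError("limit must be between 1 and 10")
    return f"{BASE_URL}/v1/files/results?{urlencode({'limit': limit})}"
```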
Response Codes
.NET SDK (Cefero.ApiSdk)
Cefero.ApiSdk is the official .NET SDK for the Cefero Document Extractor API. It is a NuGet package that simplifies integration through strongly-typed models, async/await support, and built-in error handling.
Why Use the SDK?
- Simplified Integration: No need to manage HTTP requests manually - the SDK handles all API communication
- Strong Typing: Full IntelliSense support with typed request/response models
- Built-in Error Handling: Comprehensive exception handling and retry logic
- Async/Await Support: Modern async patterns for non-blocking operations
- Production Ready: Battle-tested in enterprise environments
SDK Features
- NuGet Package: Easy installation via NuGet Package Manager
- Async/Await: Full async support for optimal performance
- Type Safety: Strongly-typed models prevent runtime errors
Installation
Install the Cefero.ApiSdk package from NuGet using one of the following methods:
Package Manager Console
Install-Package Cefero.ApiSdk
.NET CLI
dotnet add package Cefero.ApiSdk
Package Reference (Manual)
Add this to your .csproj file:
<PackageReference Include="Cefero.ApiSdk" Version="1.0.4" />
Requirements
- .NET 6.0 or higher
- Compatible with .NET 8.0, .NET 9.0, and later versions
- Works with ASP.NET Core, Console Apps, Windows Services, and more
Configuration
Configure the SDK by creating a CeferoClient instance with your API credentials.
Basic Configuration
using Cefero.ApiSdk;
// Initialize the client
var ceferoClient = new CeferoClient(
apiKey: "cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw",
clientKey: "Your Company Name",
baseUrl: "http://hub.api.cefero.com"
);
Configuration with appsettings.json
Recommended approach for production applications:
1. Create appsettings.json
{
"Cefero": {
"ApiKey": "cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw",
"ClientKey": "Your Company Name",
"BaseUrl": "http://hub.api.cefero.com"
}
}
2. Create Configuration Classes
public class CeferoSettings
{
public string ApiKey { get; set; } = string.Empty;
public string ClientKey { get; set; } = string.Empty;
public string BaseUrl { get; set; } = "http://hub.api.cefero.com";
}
public class AppConfiguration
{
public CeferoSettings Cefero { get; set; } = new();
public static AppConfiguration Load()
{
var configuration = new ConfigurationBuilder()
.SetBasePath(AppContext.BaseDirectory)
.AddJsonFile("appsettings.json", optional: false, reloadOnChange: true)
.AddEnvironmentVariables()
.Build();
var config = new AppConfiguration();
configuration.Bind(config);
return config;
}
}
3. Initialize Client from Configuration
var config = AppConfiguration.Load();
var ceferoClient = new CeferoClient(
config.Cefero.ApiKey,
config.Cefero.ClientKey,
config.Cefero.BaseUrl
);
For production environments, store sensitive credentials in environment variables or Azure Key Vault instead of appsettings.json. The SDK supports environment variable overrides automatically.
Dependency Injection Setup (ASP.NET Core)
// In Program.cs or Startup.cs
builder.Services.Configure<CeferoSettings>(
builder.Configuration.GetSection("Cefero"));
builder.Services.AddSingleton<ICeferoClient>(sp =>
{
var settings = sp.GetRequiredService<IOptions<CeferoSettings>>().Value;
return new CeferoClient(
settings.ApiKey,
settings.ClientKey,
settings.BaseUrl
);
});
SDK Usage Examples
The SDK provides three main operations: uploading files, checking file status, and retrieving all results.
Quick Start Example
The simplest way to get started with the SDK:
using Cefero.Sdk;
var options = new CeferoClientOptions
{
BaseUri = new Uri("https://your-cefero-host/"), // base_url
ApiKey = "<api-key>",
ClientKey = "<client-key>",
UploadRelativeUrl = "v1/files" // matches POST {{base_url}}/v1/files
};
using var httpClient = new HttpClient { BaseAddress = options.BaseUri };
ICeferoClient ceferoClient = new CeferoClient(httpClient, options);
var result = await ceferoClient.UploadAsync(
filePath: "Doorways 2 (2).pdf",
task: CeferoTaskType.Invoice);
Console.WriteLine($"TaskId: {result.TaskId}, Status: {result.Status}");
1. Upload a File
Upload a document for processing and receive a tracking ID:
using Cefero.ApiSdk;
var ceferoClient = new CeferoClient(
"cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw",
"Your Company Name",
"http://hub.api.cefero.com"
);
// Upload a single file
var filePath = @"C:\Documents\invoice.pdf";
var result = await ceferoClient.UploadFileAsync(filePath);
Console.WriteLine($"File ID: {result.FileId}");
Console.WriteLine($"Status: {result.Status}");
Console.WriteLine($"Message: {result.Message}");
Upload Multiple Files
var files = new[]
{
@"C:\Documents\invoice1.pdf",
@"C:\Documents\invoice2.pdf",
@"C:\Documents\receipt.jpg"
};
foreach (var file in files)
{
try
{
var result = await ceferoClient.UploadFileAsync(file);
Console.WriteLine($"✓ {Path.GetFileName(file)} - ID: {result.FileId}");
}
catch (Exception ex)
{
Console.WriteLine($"✗ {Path.GetFileName(file)} - Error: {ex.Message}");
}
}
2. Check File Status
Query the processing status and retrieve results once completed:
var fileId = "a3b8c9d0-1234-5678-90ab-cdef12345678";
var result = await ceferoClient.GetFileStatusAsync(fileId);
if (!result.Success)
{
Console.WriteLine($"Error: {result.Message}");
return;
}
foreach (var file in result.Data)
{
Console.WriteLine($"File: {file.FileName}");
Console.WriteLine($"Status: {file.Status}");
Console.WriteLine($"Queued At: {file.QueuedAt}");
// Access OCR results if processing completed
if (file.Result?.OcrResult != null)
{
var ocr = file.Result.OcrResult;
// Display header fields
if (ocr.Header?.Fields != null)
{
Console.WriteLine("\nHeader Information:");
foreach (var field in ocr.Header.Fields)
{
Console.WriteLine($" {field.Key}: {field.Value}");
}
}
// Display page data
if (ocr.Pages != null)
{
Console.WriteLine($"\nTotal Pages: {ocr.Pages.Count}");
foreach (var page in ocr.Pages)
{
Console.WriteLine($"\nPage {page.Page}:");
if (page.Lines != null)
{
foreach (var line in page.Lines)
{
if (line.Fields != null)
{
foreach (var field in line.Fields)
{
Console.WriteLine($" {field.Key}: {field.Value}");
}
}
}
}
}
}
}
}
Polling Pattern for Completion
public async Task<FileData> WaitForCompletionAsync(
string fileId,
int maxWaitSeconds = 300,
CancellationToken cancellationToken = default)
{
var elapsed = 0;
var pollInterval = 2; // seconds
while (elapsed < maxWaitSeconds)
{
var result = await ceferoClient.GetFileStatusAsync(fileId, cancellationToken);
if (result.Success && result.Data.Count > 0)
{
var file = result.Data[0];
if (file.Status == "completed")
return file;
if (file.Status == "failed")
throw new Exception($"Processing failed: {file.Error}");
}
await Task.Delay(pollInterval * 1000, cancellationToken);
elapsed += pollInterval;
}
throw new TimeoutException($"Processing timeout after {maxWaitSeconds} seconds");
}
// Usage
var uploadResult = await ceferoClient.UploadFileAsync("invoice.pdf");
var completedFile = await WaitForCompletionAsync(uploadResult.FileId);
Console.WriteLine($"Processing completed for {completedFile.FileName}");
3. Get All Results
Retrieve a list of recently processed files:
// Get all results (default: last 10 files)
var results = await ceferoClient.GetAllResultsAsync();
Console.WriteLine($"Total Files: {results.Data.Count}");
foreach (var file in results.Data)
{
Console.WriteLine($"\nFile: {file.FileName}");
Console.WriteLine($" ID: {file.FileId}");
Console.WriteLine($" Status: {file.Status}");
Console.WriteLine($" Size: {file.FileSize:N0} bytes");
Console.WriteLine($" Uploaded: {file.QueuedAt}");
if (file.Result?.OcrResult != null)
{
var ocr = file.Result.OcrResult;
Console.WriteLine($" Pages: {ocr.Pages?.Count ?? 0}");
Console.WriteLine($" Header Fields: {ocr.Header?.Fields?.Count ?? 0}");
}
}
With Custom Page Size
// Get last 5 results
var results = await ceferoClient.GetAllResultsAsync(pageSize: 5);
// Get last 10 results (the REST endpoint documents limit as 1-10)
var moreResults = await ceferoClient.GetAllResultsAsync(pageSize: 10);
Complete Application Example
Full console application demonstrating all SDK features:
using Cefero.ApiSdk;
class Program
{
static async Task Main(string[] args)
{
var client = new CeferoClient(
"cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw",
"Your Company Name",
"http://hub.api.cefero.com"
);
try
{
// 1. Upload a file
Console.WriteLine("Uploading file...");
var uploadResult = await client.UploadFileAsync("invoice.pdf");
Console.WriteLine($"✓ File uploaded - ID: {uploadResult.FileId}");
// 2. Wait for processing
Console.WriteLine("Waiting for processing...");
var fileData = await WaitForCompletion(client, uploadResult.FileId);
Console.WriteLine("✓ Processing completed");
// 3. Display results
if (fileData.Result?.OcrResult != null)
{
var ocr = fileData.Result.OcrResult;
Console.WriteLine($"\nExtracted Data:");
Console.WriteLine($" Pages: {ocr.Pages?.Count ?? 0}");
if (ocr.Header?.Fields != null)
{
Console.WriteLine(" Header:");
foreach (var field in ocr.Header.Fields)
Console.WriteLine($" {field.Key}: {field.Value}");
}
}
// 4. Get all recent results
Console.WriteLine("\nFetching recent results...");
var allResults = await client.GetAllResultsAsync(pageSize: 5);
Console.WriteLine($"✓ Found {allResults.Data.Count} recent files");
}
catch (Exception ex)
{
Console.WriteLine($"✗ Error: {ex.Message}");
}
}
static async Task<FileData> WaitForCompletion(
ICeferoClient client,
string fileId)
{
for (int i = 0; i < 60; i++) // Max 2 minutes
{
var result = await client.GetFileStatusAsync(fileId);
if (result.Success && result.Data.Count > 0)
{
var file = result.Data[0];
if (file.Status == "completed") return file;
if (file.Status == "failed")
throw new Exception($"Failed: {file.Error}");
}
await Task.Delay(2000); // Wait 2 seconds
}
throw new TimeoutException("Processing timeout");
}
}
Error Handling
The SDK provides comprehensive error handling:
try
{
var result = await ceferoClient.UploadFileAsync(filePath);
if (!result.Success)
{
Console.WriteLine($"Upload failed: {result.Message}");
return;
}
Console.WriteLine($"Success: {result.FileId}");
}
catch (FileNotFoundException ex)
{
Console.WriteLine($"File not found: {ex.Message}");
}
catch (HttpRequestException ex)
{
Console.WriteLine($"Network error: {ex.Message}");
}
catch (UnauthorizedAccessException ex)
{
Console.WriteLine($"Authentication failed: {ex.Message}");
}
catch (Exception ex)
{
Console.WriteLine($"Unexpected error: {ex.Message}");
}
Complete working examples are available in the SDK repository:
- Api Usage test: Demonstrates file upload, status checking, and results retrieval
- Api Get all result: Shows how to retrieve and display all processed results
- Both projects include configuration management, logging, and error handling
ICeferoClient Interface
The SDK exposes the following interface for easy testing and dependency injection:
public interface ICeferoClient
{
Task<UploadResponse> UploadFileAsync(
string filePath,
CancellationToken cancellationToken = default);
Task<ApiResponse<List<FileData>>> GetFileStatusAsync(
string fileId,
CancellationToken cancellationToken = default);
Task<ApiResponse<List<FileData>>> GetAllResultsAsync(
int? pageSize = null,
CancellationToken cancellationToken = default);
}
ERP Integration Guide
Learn how to seamlessly integrate the Document Extractor API into your ERP system for automated document processing and data extraction.
If you're building a .NET-based ERP integration, we strongly recommend using the Cefero.ApiSdk NuGet package instead of implementing HTTP calls manually. The SDK provides type safety, error handling, and simplified configuration.
Integration Overview
The API is designed for easy integration with any ERP system. The asynchronous processing model allows your ERP to submit documents and continue operations without blocking.
Authentication Setup
Every API request must include authentication headers. These credentials identify your organization and authorize access to the Document Extractor service.
The following headers must be included in every API call:
| Header Name | Value | Description |
|---|---|---|
| Authorization | cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw | Production API authentication token |
| X-ClientId | <Your Company Name> | Your organization identifier (replace with your actual company name) |
Implementation Example
Here's how to configure these headers in different platforms:
// .NET Example
var client = new HttpClient();
client.DefaultRequestHeaders.Add("Authorization", "cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw");
client.DefaultRequestHeaders.Add("X-ClientId", "Your Company Name");
// Angular/TypeScript Example
const headers = new HttpHeaders({
'Authorization': 'cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw',
'X-ClientId': 'Your Company Name'
});
// Python Example
headers = {
'Authorization': 'cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw',
'X-ClientId': 'Your Company Name'
}
- Replace <Your Company Name> with your actual organization name
- Store the Authorization token securely (use environment variables or secret management)
- These headers are required for all endpoints: upload, status check, and results retrieval
- Missing or invalid headers will result in a 401 Unauthorized response
- Don't have API access? Request your API credentials here
Recommended Workflow
- Document Capture - Monitor incoming documents (invoices, purchase orders, receipts) through your ERP's document upload module
- API Upload - Submit documents to the API endpoint and receive tracking IDs immediately
- Status Tracking - Implement polling or webhook handlers to monitor processing status
- Data Extraction - Retrieve structured data once processing completes
- ERP Mapping - Map extracted fields to your ERP's data model (GL accounts, vendors, line items)
- Validation - Implement business rule validation before committing to your ERP database
- Auto-Posting - Create transactions automatically or queue for human approval
Integration Patterns
Pattern 1: Real-Time Processing
Upload documents as soon as they arrive in your ERP and poll for results. Suitable for low-volume scenarios.
Pattern 2: Batch Processing
Queue documents throughout the day and process them in scheduled batches (e.g., hourly or end-of-day). Ideal for high-volume operations.
Pattern 3: Event-Driven Integration
Use background workers or message queues to handle uploads and status checks asynchronously. Recommended for enterprise deployments.
Data Mapping Considerations
Map the API's extracted data to your ERP's data structures:
- Vendor Information: Match extracted vendor names/IDs to your vendor master file
- GL Accounts: Apply business logic to assign proper general ledger account codes
- Cost Centers: Route documents to appropriate departments or cost centers
- Line Items: Parse and validate individual transaction lines
- Tax Handling: Map extracted tax amounts to your tax codes and jurisdictions
- Document Metadata: Store file IDs, processing dates, and audit trails
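The mapping step can be sketched as a small pure function. The field keys `vendor`, `invoice_number`, and `total_amount` are taken from the .NET example in this guide; the actual keys depend on what the extraction returns for your documents, and the `ErpInvoice` record is a hypothetical stand-in for your ERP's data model.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from typing import Optional


@dataclass
class ErpInvoice:
    """Hypothetical ERP-side record used only for this illustration."""
    vendor_name: Optional[str] = None
    invoice_number: Optional[str] = None
    total_amount: Optional[Decimal] = None


def map_header_fields(fields: dict) -> ErpInvoice:
    """Map extracted header fields (key -> text value) onto an ERP record."""
    invoice = ErpInvoice()
    for key, value in fields.items():
        name = key.strip().lower()
        if name == "vendor":
            invoice.vendor_name = value.strip()
        elif name == "invoice_number":
            invoice.invoice_number = value.strip()
        elif name == "total_amount":
            try:
                # Drop thousands separators; Decimal avoids float rounding.
                invoice.total_amount = Decimal(value.replace(",", "").strip())
            except InvalidOperation:
                # Leave unset so the document can go to manual review.
                invoice.total_amount = None
    return invoice
```

Leaving an unparseable amount as `None` instead of raising keeps one bad field from blocking the whole document, which fits a review-queue workflow.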
Integration Examples
.NET ERP Integration (Using SDK)
Background service for processing documents in your .NET-based ERP using the Cefero SDK:
using Cefero.ApiSdk;
// Document processing service using Cefero SDK
public class DocumentProcessingService : BackgroundService
{
private readonly ICeferoClient _ceferoClient;
private readonly IErpRepository _erpRepo;
private readonly ILogger<DocumentProcessingService> _logger;
public DocumentProcessingService(
ICeferoClient ceferoClient,
IErpRepository erpRepo,
ILogger<DocumentProcessingService> logger)
{
_ceferoClient = ceferoClient;
_erpRepo = erpRepo;
_logger = logger;
}
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
while (!stoppingToken.IsCancellationRequested)
{
var pendingDocs = await _erpRepo.GetPendingDocumentsAsync();
foreach (var doc in pendingDocs)
{
try
{
// Upload to API using SDK
var uploadResult = await _ceferoClient.UploadFileAsync(
doc.FilePath, stoppingToken);
await _erpRepo.UpdateFileIdAsync(doc.Id, uploadResult.FileId);
// Poll for completion and map to ERP
var result = await WaitForCompletionAsync(
uploadResult.FileId, stoppingToken);
await MapAndSaveToErpAsync(result);
_logger.LogInformation(
"Successfully processed document {FileName}", doc.FileName);
}
catch (Exception ex)
{
_logger.LogError(ex,
"Failed to process document {FileName}", doc.FileName);
await _erpRepo.MarkAsFailedAsync(doc.Id, ex.Message);
}
}
await Task.Delay(TimeSpan.FromMinutes(5), stoppingToken);
}
}
private async Task<FileData> WaitForCompletionAsync(
string fileId,
CancellationToken cancellationToken)
{
for (int i = 0; i < 150; i++) // Max 5 minutes
{
var result = await _ceferoClient.GetFileStatusAsync(
fileId, cancellationToken);
if (result.Success && result.Data.Count > 0)
{
var file = result.Data[0];
if (file.Status == "completed") return file;
if (file.Status == "failed")
throw new Exception($"Processing failed: {file.Error}");
}
await Task.Delay(2000, cancellationToken);
}
throw new TimeoutException("Processing timeout");
}
private async Task MapAndSaveToErpAsync(FileData fileData)
{
var ocrResult = fileData.Result?.OcrResult;
if (ocrResult == null) return;
// Map OCR results to ERP data model
var erpDocument = new ErpDocument
{
OriginalFileId = fileData.FileId,
FileName = fileData.FileName,
ProcessedDate = DateTime.UtcNow
};
// Extract header information
if (ocrResult.Header?.Fields != null)
{
foreach (var field in ocrResult.Header.Fields)
{
// Map to ERP fields based on your business logic
switch (field.Key.ToLower())
{
case "vendor":
erpDocument.VendorName = field.Value;
break;
case "invoice_number":
erpDocument.InvoiceNumber = field.Value;
break;
case "total_amount":
erpDocument.TotalAmount = decimal.Parse(field.Value);
break;
}
}
}
// Save to ERP database
await _erpRepo.SaveDocumentAsync(erpDocument);
}
}
// Configure in Program.cs or Startup.cs
builder.Services.AddSingleton<ICeferoClient>(sp =>
{
var config = sp.GetRequiredService<IConfiguration>();
return new CeferoClient(
config["Cefero:ApiKey"],
config["Cefero:ClientKey"],
config["Cefero:BaseUrl"]
);
});
builder.Services.AddHostedService<DocumentProcessingService>();
Angular ERP Dashboard
Upload component for web-based ERP interface:
// Service with authentication
@Injectable({ providedIn: 'root' })
export class DocumentExtractorService {
private baseUrl = 'http://hub.api.cefero.com';
private headers = new HttpHeaders({
'Authorization': 'cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw',
'X-ClientId': 'Your Company Name'
});
constructor(private http: HttpClient) {}
uploadFile(file: File): Observable<any> {
const formData = new FormData();
formData.append('UploadFiles', file);
return this.http.post(`${this.baseUrl}/v1/files`, formData,
{ headers: this.headers });
}
getStatus(fileId: string): Observable<any> {
return this.http.get(`${this.baseUrl}/v1/files/${fileId}`,
{ headers: this.headers });
}
}
// Component usage
export class ErpDocumentUploadComponent {
constructor(
private extractorService: DocumentExtractorService,
private erpService: ErpDataService
) {}
processDocument(file: File): void {
this.extractorService.uploadFile(file).pipe(
switchMap(response => this.pollUntilComplete(response.fileId)),
switchMap(result => this.erpService.createTransaction(result))
).subscribe({
next: (transaction) => this.notifySuccess(transaction),
error: (error) => this.handleError(error)
});
}
private pollUntilComplete(fileId: string): Observable<any> {
return interval(2000).pipe(
switchMap(() => this.extractorService.getStatus(fileId)),
filter(status => status.status === 'completed'),
take(1)
);
}
}
Python ERP Integration
Batch processor for Python-based ERP systems:
import requests
import time


class ErpDocumentProcessor:
    def __init__(self, erp_db):
        self.base_url = "http://hub.api.cefero.com"
        # Required authentication headers
        self.headers = {
            'Authorization': 'cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw',
            'X-ClientId': 'Your Company Name'
        }
        self.db = erp_db

    def upload_file(self, file_path):
        """Upload a file to the API"""
        with open(file_path, 'rb') as f:
            files = {'UploadFiles': f}
            response = requests.post(f'{self.base_url}/v1/files',
                                     headers=self.headers, files=files)
        response.raise_for_status()  # surface 4xx/5xx instead of parsing an error body
        return response.json()

    def get_status(self, file_id):
        """Check file processing status"""
        response = requests.get(f'{self.base_url}/v1/files/{file_id}',
                                headers=self.headers)
        response.raise_for_status()
        return response.json()

    def process_batch(self, document_paths):
        """Process a batch of documents and post to ERP"""
        for path in document_paths:
            # Upload to API
            response = self.upload_file(path)
            file_id = response['fileId']
            # Wait for processing
            result = self.wait_for_completion(file_id)
            # Map and validate
            erp_data = self.map_to_erp_format(result)
            if self.validate_business_rules(erp_data):
                self.db.create_invoice(erp_data)
            else:
                self.db.queue_for_review(erp_data)

    def wait_for_completion(self, file_id, max_wait=300):
        """Poll until file processing completes"""
        elapsed = 0
        while elapsed < max_wait:
            status = self.get_status(file_id)
            if status['status'] == 'completed':
                return status
            if status['status'] == 'failed':
                # Stop polling as soon as the API reports a failure
                raise RuntimeError(f"Processing failed for file {file_id}")
            time.sleep(2)
            elapsed += 2
        raise TimeoutError(f"Processing timeout after {max_wait}s")

    def map_to_erp_format(self, api_result):
        """Map API response to ERP data model"""
        # lookup_vendor, determine_gl_account, parse_line_items and
        # validate_business_rules are ERP-specific and must be supplied by you
        return {
            'vendor_id': self.lookup_vendor(api_result),
            'gl_account': self.determine_gl_account(api_result),
            'amount': api_result['results']['total_amount'],
            'line_items': self.parse_line_items(api_result)
        }
Best Practices for ERP Integration
- Implement idempotent operations to handle retries safely
- Store file IDs in your ERP database for audit trails
- Use transaction boundaries when posting to your ERP
- Implement error queues for failed documents
- Log all API interactions for troubleshooting
- Configure appropriate timeout values based on document size
- Consider implementing a manual review queue for low-confidence extractions
- Use environment-specific API keys (dev, staging, production)
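One way to make uploads idempotent, as the first item suggests, is to key each submission on a hash of the file contents so that a retry reuses the stored file ID instead of uploading again. This is a sketch of that idea only: `UploadLedger` is a hypothetical class, the `upload` callable stands in for whatever performs the real upload, and in a real integration the mapping would live in the ERP database rather than in memory.

```python
import hashlib
from typing import Callable, Dict


class UploadLedger:
    """Remembers which file contents were already submitted."""

    def __init__(self) -> None:
        self._file_ids: Dict[str, str] = {}  # content hash -> fileId

    def submit(self, data: bytes, upload: Callable[[bytes], str]) -> str:
        """Upload data once; repeated calls return the stored fileId."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._file_ids:
            # Only the first submission of these bytes reaches the API.
            self._file_ids[digest] = upload(data)
        return self._file_ids[digest]
```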
Security Considerations
- Store API credentials in secure configuration management systems (Azure Key Vault, AWS Secrets Manager, etc.)
- Use separate API keys for different environments
- Implement role-based access control for document processing functions
- Encrypt sensitive data at rest and in transit
- Maintain audit logs of all document processing activities
Performance Optimization
- Process documents asynchronously using background workers or queues
- Implement connection pooling for HTTP clients
- Use exponential backoff for status polling
- Cache vendor lookups and GL account mappings
- Monitor API usage and implement rate limiting on your side
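Exponential backoff for status polling could look like the sketch below. The status values `completed` and `failed` are the ones used in the examples in this guide; the default delays, the 30-second cap, and the injected `fetch_status` and `sleep` callables are assumptions made so the loop can be tested without network access.

```python
import time
from typing import Callable


def poll_with_backoff(
    fetch_status: Callable[[], dict],
    initial_delay: float = 2.0,
    max_delay: float = 30.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the status is terminal, doubling the wait after each check."""
    delay, waited = initial_delay, 0.0
    while True:
        payload = fetch_status()
        status = payload.get("status")
        if status == "completed":
            return payload
        if status == "failed":
            raise RuntimeError(f"Processing failed: {payload.get('message')}")
        if waited >= timeout:
            raise TimeoutError(f"No result after {timeout:.0f}s")
        sleep(delay)
        waited += delay
        # Double the interval, but never wait longer than max_delay.
        delay = min(delay * 2, max_delay)
```

With the defaults, the waits grow as 2, 4, 8, 16, 30, 30, ... seconds, so short jobs are still picked up quickly while long jobs generate far fewer requests than a fixed 2-second interval.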
Code Examples
Quick-start code snippets for integrating the API. Each example demonstrates file upload and status checking with minimal code.
The examples below show manual HTTP implementations for various languages. If you're using .NET, we strongly recommend using the Cefero.ApiSdk NuGet package instead, which provides a much simpler and type-safe API.
.NET (C#) - Using SDK (Recommended)
Simplest approach for .NET applications:
// Install: dotnet add package Cefero.ApiSdk
using Cefero.ApiSdk;
var client = new CeferoClient(
"cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw",
"Your Company Name",
"http://hub.api.cefero.com"
);
// Upload file
var uploadResult = await client.UploadFileAsync("invoice.pdf");
Console.WriteLine($"File ID: {uploadResult.FileId}");
// Get status
var statusResult = await client.GetFileStatusAsync(uploadResult.FileId);
var file = statusResult.Data[0];
Console.WriteLine($"Status: {file.Status}");
// Get OCR results when completed
if (file.Result?.OcrResult != null)
{
foreach (var field in file.Result.OcrResult.Header.Fields)
Console.WriteLine($"{field.Key}: {field.Value}");
}
.NET (C#) - Manual HTTP Implementation
If you prefer not to use the SDK, here's the manual HTTP approach:
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq; // JObject requires the Newtonsoft.Json package

public class DocumentExtractorClient
{
    private readonly HttpClient _http;
    private const string BaseUrl = "http://hub.api.cefero.com";

    public DocumentExtractorClient()
    {
        _http = new HttpClient();
        _http.DefaultRequestHeaders.Add("Authorization",
            "cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw");
        _http.DefaultRequestHeaders.Add("X-ClientId", "Your Company Name");
    }

    // Uploads the file and returns the fileId from the response
    public async Task<string> UploadAsync(string filePath)
    {
        using var content = new MultipartFormDataContent();
        using var file = File.OpenRead(filePath);
        content.Add(new StreamContent(file), "UploadFiles", Path.GetFileName(filePath));
        var response = await _http.PostAsync($"{BaseUrl}/v1/files", content);
        response.EnsureSuccessStatusCode(); // throw on 4xx/5xx instead of parsing an error body
        var json = await response.Content.ReadAsStringAsync();
        return JObject.Parse(json)["fileId"].ToString();
    }

    // Returns the raw status payload for the given file
    public async Task<JObject> GetStatusAsync(string fileId)
    {
        var response = await _http.GetAsync($"{BaseUrl}/v1/files/{fileId}");
        response.EnsureSuccessStatusCode();
        return JObject.Parse(await response.Content.ReadAsStringAsync());
    }
}
// Usage
var client = new DocumentExtractorClient();
var fileId = await client.UploadAsync("invoice.pdf");
var status = await client.GetStatusAsync(fileId);
Angular (TypeScript)
Service with RxJS for reactive programming:
@Injectable({ providedIn: 'root' })
export class DocumentExtractorService {
private baseUrl = 'http://hub.api.cefero.com';
private headers = new HttpHeaders({
'Authorization': 'cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw',
'X-ClientId': 'Your Company Name'
});
constructor(private http: HttpClient) {}
uploadFile(file: File): Observable<any> {
const formData = new FormData();
formData.append('UploadFiles', file);
return this.http.post(`${this.baseUrl}/v1/files`, formData,
{ headers: this.headers });
}
getStatus(fileId: string): Observable<any> {
return this.http.get(`${this.baseUrl}/v1/files/${fileId}`,
{ headers: this.headers });
}
}
// Component usage
this.service.uploadFile(file).pipe(
switchMap(res => this.pollStatus(res.fileId))
).subscribe(result => console.log(result));
Python
Clean implementation using requests library:
import requests


class DocumentExtractorClient:
    def __init__(self):
        self.base_url = "http://hub.api.cefero.com"
        self.headers = {
            'Authorization': 'cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw',
            'X-ClientId': 'Your Company Name'
        }

    def upload_file(self, file_path):
        with open(file_path, 'rb') as f:
            files = {'UploadFiles': f}
            response = requests.post(f'{self.base_url}/v1/files',
                                     headers=self.headers, files=files)
        return response.json()

    def get_status(self, file_id):
        response = requests.get(f'{self.base_url}/v1/files/{file_id}',
                                headers=self.headers)
        return response.json()


# Usage
client = DocumentExtractorClient()
result = client.upload_file("invoice.pdf")
status = client.get_status(result['fileId'])
Error Responses
Authentication Error (401)
{
"error": "Unauthorized",
"message": "Invalid or missing authentication token",
"statusCode": 401
}
Validation Error (400)
{
"error": "Bad Request",
"message": "File is required",
"statusCode": 400
}
Processing Error (500)
{
"error": "Internal Server Error",
"message": "File processing failed",
"fileId": "a3b8c9d0-1234-5678-90ab-cdef12345678",
"statusCode": 500
}
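All three error bodies share `error`, `message`, and `statusCode`, so a client can convert them into a single exception type. The sketch below assumes that shape holds for every error response; the `ApiError` class and the rule that only 5xx errors are worth retrying are illustrative choices, not documented behavior.

```python
from typing import Optional


class ApiError(Exception):
    """Exception built from the JSON error body shown above."""

    def __init__(self, status_code: int, message: str, file_id: Optional[str] = None):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message
        self.file_id = file_id

    @property
    def retryable(self) -> bool:
        # Assumption: only server-side (5xx) errors are worth retrying.
        return self.status_code >= 500


def error_from_payload(payload: dict) -> ApiError:
    """Convert an error response body into an ApiError."""
    message = payload.get("message") or payload.get("error") or "Unknown error"
    return ApiError(payload.get("statusCode", 0), message, payload.get("fileId"))
```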
Best Practices
- Status Polling: Check file status every 2-3 seconds. Increase intervals for long-running processes.
- File Size: Split files larger than 100MB for better performance.
- Error Handling: Implement retry logic for network failures.
- Timeouts: Set upload timeouts based on file size (allow 1 minute per 10MB).
- Rate Limits: Respect API limits to maintain service quality.
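The timeout guideline above is simple arithmetic and can be computed from the file size. This sketch treats 1 MB as 1024 * 1024 bytes (consistent with the `fileSize: 1048576` sample, but still an assumption) and rounds up to whole minutes with a one-minute minimum, which is a design choice rather than part of the guideline.

```python
import math

MB = 1024 * 1024  # assumption: binary megabytes


def upload_timeout_seconds(file_size_bytes: int) -> int:
    """Allow one minute per started 10 MB, with a one-minute minimum."""
    blocks = max(1, math.ceil(file_size_bytes / (10 * MB)))
    return blocks * 60


def should_split(file_size_bytes: int) -> bool:
    """Flag files above the 100 MB threshold mentioned in the list above."""
    return file_size_bytes > 100 * MB
```

For example, a 25 MB file spans three 10 MB blocks, so the helper returns 180 seconds.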
Store API keys in environment variables or secure vaults. Never commit keys to version control.