diff --git a/Observability.md b/Observability.md new file mode 100644 index 0000000..ee7160a --- /dev/null +++ b/Observability.md @@ -0,0 +1,65 @@ +# Observability Strategy: Custom Metrics vs. OpenTelemetry + +This document summarizes the discussion on observability strategies for the Dynaman project, covering the trade-offs between using direct custom metrics and adopting the OpenTelemetry standard. + +## Initial Question: Custom Metrics vs. OpenTelemetry + +The primary question was to understand the difference between implementing custom metrics manually versus using OpenTelemetry with AWS CloudWatch, especially concerning cost and implementation effort. + +### High-Level Comparison + +| Feature | Custom Metrics (e.g., using `boto3`) | CloudWatch with OpenTelemetry | +| :--- | :--- | :--- | +| **Concept** | **Direct API Interaction.** Manually construct and send metric data directly to the CloudWatch API. | **Standardized Instrumentation.** Use a vendor-neutral API (OTel) in the app, which then sends data to a backend via a configurable "exporter". | +| **Implementation** | Entirely manual. Requires writing specific code for every single metric. | **Automatic & Manual.** Provides auto-instrumentation for frameworks (FastAPI) to capture standard signals (latency, errors) with minimal setup. | +| **Flexibility** | **Low (Vendor Lock-in).** Code is tightly coupled to the AWS `boto3` SDK. Switching providers requires a complete rewrite of monitoring code. | **High (Vendor-Neutral).** Application code is decoupled from the backend. Switching from CloudWatch to another provider is a configuration change. | +| **Cost** | **Direct Cost:** Standard CloudWatch ingestion fees. **Indirect Cost:** High development and maintenance time. | **Direct Cost:** Ingestion fees are the same. **Indirect Cost:** Lower long-term development cost. A small compute cost exists if using the Collector. 
| +| **Features**| **Metrics only.** | **Metrics, Traces, and Logs.** OTel is designed to handle all three pillars of observability, allowing for rich, correlated data. | + +### Does OpenTelemetry create more metrics? + +Yes, out-of-the-box, OTel's auto-instrumentation captures a comprehensive set of standard metrics (e.g., latency histograms for every API endpoint), which is more than you would create manually. + +However, this is a feature. It provides a rich, detailed view of system health from the start. Crucially, **you have full control to manage cost** by configuring **Views** in the SDK or **Processors** in the Collector to filter, aggregate, or drop metrics before they are sent to CloudWatch. + +## The Role of the OpenTelemetry Collector + +The next question was about the components needed on the AWS side and the concept of a "sidecar container". + +There are two main patterns to get OTel data to CloudWatch: + +### Path 1: Direct Export +The application uses an AWS-specific exporter within the OTel SDK to send data directly to CloudWatch APIs. + +**Flow:** +`[Your Python App + OTel SDK + AWS Exporter] ----(HTTPS)----> [AWS CloudWatch API]` + +### Path 2: Collector Sidecar (Recommended) +The application sends its data to an OTel Collector running as a **sidecar container** alongside the application container in the same ECS Task. The Collector then forwards the data to CloudWatch. + +**Flow:** +`[Your App (OTel SDK)] --(localhost)--> [OTel Collector (Sidecar)] --(HTTPS)--> [AWS CloudWatch API]` + +**Associated Costs:** +* **CloudWatch Ingestion Cost:** No change. This is the same in both patterns. +* **Compute Cost:** This pattern introduces a small, additional compute cost because the sidecar container requires its own CPU and memory allocation in the ECS task. + +The benefits of the collector (improved performance, reliability, centralized configuration) generally outweigh its small compute cost. 
+ +## Cluster Capacity Analysis + +A request was made to analyze the existing Terraform code to determine if the ECS cluster could handle the additional resource load of an OTel Collector sidecar. + +### Analysis Summary + +1. **ECS Node:** The cluster runs on a single `t4g.small` instance, which provides **2048 CPU units** and **2048 MiB of memory**. +2. **Current Usage:** The 6 running tasks reserve a total of **1536 CPU units (75%)** and **1536 MiB of memory (75%)**. +3. **Sidecar Impact:** Adding a sidecar with an estimated **128 CPU units** and **128 MiB memory** to the 4 backend tasks results in a new total reservation. + * **New Total CPU:** (2 tasks × 256) + (4 tasks × 384) = **2048 CPU units** + * **New Total Memory:** (2 tasks × 256) + (4 tasks × 384) = **2048 MiB** + +### Conclusion + +**Yes, the `t4g.small` node is technically sufficient, but it will be at 100% resource reservation.** This is risky and leaves no buffer for the OS, the ECS agent, or deployment activities. + +**Recommendation:** To safely run with the OTel Collector sidecar, the instance type should be upgraded to a **`t4g.medium`** to provide a healthy resource buffer. 
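The reservation arithmetic in the analysis can be double-checked with a few lines of Python (a sketch; the task counts and sizes are those quoted above, and the memory math is identical):

```python
# Back-of-the-envelope check of the CPU reservation math from the analysis.
NODE_CPU_UNITS = 2048  # t4g.small: 2 vCPU = 2048 CPU units (memory: 2048 MiB, same numbers)

tasks_without_sidecar = 2 * 256       # 2 tasks at 256 units each
tasks_with_sidecar = 4 * (256 + 128)  # 4 backend tasks, each plus a 128-unit collector sidecar

total_cpu = tasks_without_sidecar + tasks_with_sidecar
print(total_cpu, f"{total_cpu / NODE_CPU_UNITS:.0%} of the node")  # → 2048 100% of the node
```

At exactly 100% reservation there is no headroom for the ECS agent, the OS, or rolling deployments, which is what motivates the `t4g.medium` recommendation.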
diff --git a/auth-service/main.py b/auth-service/main.py index eb964ba..89d9fa9 100644 --- a/auth-service/main.py +++ b/auth-service/main.py @@ -6,6 +6,7 @@ from api.dependencies import get_user_repository, get_db from domain.entities.user import User, UserRole from domain.services.security_service import SecurityService +from opentelemetry_config import setup_opentelemetry @asynccontextmanager async def lifespan(app: FastAPI): @@ -33,6 +34,9 @@ async def lifespan(app: FastAPI): app = FastAPI(title="Dynaman Auth Service", lifespan=lifespan) +# Setup OpenTelemetry +setup_opentelemetry(app) + # CORS Configuration origins = [ "http://localhost:5173", # Vite default @@ -48,6 +52,23 @@ async def lifespan(app: FastAPI): allow_headers=["*"], ) +# New Relic UI Config Endpoint +import os +from pydantic import BaseModel + +class UiTelemetryConfig(BaseModel): + new_relic_browser_ingest_key: str + new_relic_browser_app_id: str + environment: str + +@app.get("/api/v1/config/ui", response_model=UiTelemetryConfig) +async def ui_config(): + return UiTelemetryConfig( + new_relic_browser_ingest_key=os.environ.get("NEW_RELIC_BROWSER_INGEST_KEY", ""), + new_relic_browser_app_id=os.environ.get("NEW_RELIC_BROWSER_APP_ID", ""), + environment=os.environ.get("APP_ENVIRONMENT", "unknown"), + ) + @app.get("/health") async def health_check(): return {"status": "ok"} diff --git a/auth-service/opentelemetry_config.py b/auth-service/opentelemetry_config.py new file mode 100644 index 0000000..60303ec --- /dev/null +++ b/auth-service/opentelemetry_config.py @@ -0,0 +1,87 @@ +import os +from opentelemetry import trace, metrics +from opentelemetry.sdk.trace import TracerProvider +from opentelemetry.sdk.trace.export import BatchSpanProcessor +from opentelemetry.sdk.resources import Resource +from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter +from opentelemetry.sdk.metrics import MeterProvider +from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader 
+from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter + +from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor +from opentelemetry.instrumentation.pymongo import PymongoInstrumentor +from opentelemetry.instrumentation.requests import RequestsInstrumentor + + +def setup_opentelemetry(app): + """Configure OpenTelemetry for the application.""" + + if os.environ.get("OTEL_ENABLED") != "true": + print("OpenTelemetry is disabled.") + return + + print("OpenTelemetry is enabled. Initializing...") + # Get the service name from an environment variable, default to 'auth-service' + service_name = os.environ.get("OTEL_SERVICE_NAME", "auth-service") + deployment_environment = os.environ.get("APP_ENVIRONMENT", "unknown") + + # Set up a resource with the service name + resource = Resource(attributes={ + "service.name": service_name, + "deployment.environment": deployment_environment, + }) + + # --- TRACES SETUP --- + # Set up a TracerProvider + tracer_provider = TracerProvider(resource=resource) + trace.set_tracer_provider(tracer_provider) + + # Configure the OTLP exporter to send data to the collector sidecar + # The endpoint is the default for the OTel Collector's gRPC port + otlp_exporter = OTLPSpanExporter( + endpoint=os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4317"), + insecure=True # Use insecure connection for localhost communication + ) + + # Use a BatchSpanProcessor to send spans in batches + span_processor = BatchSpanProcessor(otlp_exporter) + tracer_provider.add_span_processor(span_processor) + # --------------------- + + # --- METRICS SETUP --- + # Configure the Metric Exporter (pointing to Collector gRPC port) + metric_exporter = OTLPMetricExporter( + endpoint=os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4317"), + insecure=True + ) + + # Metrics are sent periodically (default is every 60s) + reader = PeriodicExportingMetricReader(metric_exporter) + + # Set up the MeterProvider 
with the same resource tags + meter_provider = MeterProvider(resource=resource, metric_readers=[reader]) + metrics.set_meter_provider(meter_provider) + # --------------------- + + # Instrument FastAPI + FastAPIInstrumentor.instrument_app(app, tracer_provider=tracer_provider) + + # Instrument Pymongo for MongoDB queries (which covers motor) + def request_hook(span, event): + if span and span.is_recording(): + # Manually ensure New Relic sees the database name + # 'event.database_name' is provided by the pymongo monitoring API + span.set_attribute("db.name", event.database_name) + span.set_attribute("db.namespace", event.database_name) # Newer standard + + # Optional: capture the collection name if missing + if hasattr(event, 'command') and event.command: + collection = event.command.get(event.command_name) + if isinstance(collection, str): + span.set_attribute("db.collection.name", collection) + + # Apply the instrumentation with the hook + PymongoInstrumentor().instrument(tracer_provider=tracer_provider, request_hook=request_hook) + + # Instrument the requests library for any outgoing HTTP calls + RequestsInstrumentor().instrument(tracer_provider=tracer_provider) diff --git a/auth-service/requirements.txt b/auth-service/requirements.txt index a39ae7d..d5fab51 100644 --- a/auth-service/requirements.txt +++ b/auth-service/requirements.txt @@ -14,3 +14,11 @@ pytest-cov flake8 mongomock mongomock-motor + +# OpenTelemetry +opentelemetry-api +opentelemetry-sdk +opentelemetry-exporter-otlp +opentelemetry-instrumentation-fastapi +opentelemetry-instrumentation-requests +opentelemetry-instrumentation-pymongo diff --git a/docker-compose.yml b/docker-compose.yml index cb095e3..be34699 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -24,6 +24,22 @@ services: depends_on: - mongodb + + otel-collector: + # Pinned to 0.143.1 for the spanmetrics connector configuration; verify this tag against the published collector-contrib releases before deploying. 
+ image: otel/opentelemetry-collector-contrib:0.143.1 + container_name: dyna_otel_collector + restart: always + command: ["--config=/etc/otel-collector-local-config.yaml"] + volumes: + - ./otel-collector-local-config.yaml:/etc/otel-collector-local-config.yaml + ports: + - "4317:4317" # OTLP gRPC receiver + - "4318:4318" # OTLP HTTP receiver + environment: + # The NEW_RELIC_LICENSE_KEY must be set in your environment (e.g., in a .env file) + - NEW_RELIC_LICENSE_KEY=${NEW_RELIC_LICENSE_KEY:?Please set NEW_RELIC_LICENSE_KEY in your environment} + - NEW_RELIC_OTLP_ENDPOINT=${NEW_RELIC_OTLP_ENDPOINT:-https://otlp.nr-data.net:443} + engine-metadata: build: ./engine container_name: dyna_engine_metadata @@ -31,8 +47,15 @@ services: - APP_MODE=metadata - MONGODB_URL=mongodb://mongodb:27017 - DATABASE_NAME=dynaman + - OTEL_ENABLED=true + - OTEL_SERVICE_NAME=engine-meta + - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317 + - OTEL_EXPORTER_OTLP_PROTOCOL=grpc + - OTEL_EXPORTER_OTLP_INSECURE=true + - APP_ENVIRONMENT=development depends_on: - mongodb + - otel-collector restart: always engine-execution: @@ -42,8 +65,15 @@ services: - APP_MODE=execution - MONGODB_URL=mongodb://mongodb:27017 - DATABASE_NAME=dynaman + - OTEL_ENABLED=true + - OTEL_SERVICE_NAME=engine-exec + - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317 + - OTEL_EXPORTER_OTLP_PROTOCOL=grpc + - OTEL_EXPORTER_OTLP_INSECURE=true + - APP_ENVIRONMENT=development depends_on: - mongodb + - otel-collector restart: always auth-service: @@ -53,8 +83,17 @@ services: - MONGODB_URL=mongodb://mongodb:27017 - DATABASE_NAME=dynaman_auth - SECRET_KEY=09d25e094faa6ca2556c818166b7a9563b93f7099f6f0f4caa6cf63b88e8d3e7 + - OTEL_ENABLED=true + - OTEL_SERVICE_NAME=auth-service + - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317 + - OTEL_EXPORTER_OTLP_PROTOCOL=grpc + - OTEL_EXPORTER_OTLP_INSECURE=true + - APP_ENVIRONMENT=development + - NEW_RELIC_BROWSER_INGEST_KEY=${NEW_RELIC_BROWSER_INGEST_KEY} + - 
NEW_RELIC_BROWSER_APP_ID=${NEW_RELIC_BROWSER_APP_ID} depends_on: - mongodb + - otel-collector restart: always api-gateway: diff --git a/dynaman-ui/nginx.conf b/dynaman-ui/nginx.conf index 1b65ce1..2f30560 100644 --- a/dynaman-ui/nginx.conf +++ b/dynaman-ui/nginx.conf @@ -13,4 +13,12 @@ server { # location /api/ { # proxy_pass http://engine:8000/; # } + + # Expose Nginx stub status metrics + location /nginx_status { + stub_status on; + access_log off; + allow 127.0.0.1; # Allow access only from localhost (or specific IPs) + deny all; # Without an explicit deny, all other clients would still be allowed + } } diff --git a/dynaman-ui/src/components/DynamicForm.tsx b/dynaman-ui/src/components/DynamicForm.tsx index 76be6c6..2a013f3 100644 --- a/dynaman-ui/src/components/DynamicForm.tsx +++ b/dynaman-ui/src/components/DynamicForm.tsx @@ -26,34 +26,58 @@ interface LayoutItem { fieldType?: string; structureType?: string; children?: LayoutItem[]; + required?: boolean; + readOnly?: boolean; + placeholder?: string; + helperText?: string; } const LayoutRenderer = ({ items, formData, - onChange + onChange, + fieldErrors }: { items: LayoutItem[], formData: any, - onChange: (field: string, val: any) => void + onChange: (field: string, val: any) => void, + fieldErrors: Map<string, string> }) => { if (!items) return null; return (
{items.map(item => { + let errorMessage: string | undefined; + let isFieldRequiredAndEmpty = false; + if (item.type === 'field' && item.fieldName) { + errorMessage = fieldErrors.get(item.fieldName); + + if (item.required) { + if (item.fieldType === 'boolean') { + isFieldRequiredAndEmpty = (formData[item.fieldName] === false || formData[item.fieldName] === undefined); + } else { + isFieldRequiredAndEmpty = (!formData[item.fieldName] || formData[item.fieldName] === ''); + } + isFieldRequiredAndEmpty = isFieldRequiredAndEmpty && !errorMessage; + } + return (
- + {item.fieldType === 'boolean' ? ( -
+
onChange(item.fieldName!, e.target.checked)} + disabled={item.readOnly} /> Yes
@@ -61,9 +85,11 @@ const LayoutRenderer = ({ onChange(item.fieldName!, Number(e.target.value))} - placeholder={`Enter ${item.label}`} + placeholder={item.placeholder || `Enter ${item.label}`} + disabled={item.readOnly} + className={isFieldRequiredAndEmpty ? "border-red-500" : ""} /> ) : item.fieldType === 'date' ? ( onChange(item.fieldName!, e.target.value)} + disabled={item.readOnly} + className={isFieldRequiredAndEmpty ? "border-red-500" : ""} /> ) : ( onChange(item.fieldName!, e.target.value)} - placeholder={`Enter ${item.label}`} + placeholder={item.placeholder || `Enter ${item.label}`} + disabled={item.readOnly} + className={isFieldRequiredAndEmpty ? "border-red-500" : ""} /> )} + {item.helperText &&

<p className="text-xs text-gray-500 mt-1">{item.helperText}</p>} + {errorMessage && <p className="text-xs text-red-500 mt-1">{errorMessage}</p>}
); } @@ -89,7 +121,7 @@ const LayoutRenderer = ({ return (

{item.label}

- +
) } @@ -105,17 +137,20 @@ export const DynamicForm: React.FC = (props) => { const [loading, setLoading] = useState(true); const [formData, setFormData] = useState({}); const [error, setError] = useState(null); + const [fieldErrors, setFieldErrors] = useState>(new Map()); const { t } = useLanguage(); useEffect(() => { if (!props.isOpen) return; - // Reset form data + // Reset form data and errors if (initialData) { setFormData(initialData); } else { setFormData({}); } + setFieldErrors(new Map()); + setError(null); const fetchLayout = async () => { setLoading(true); @@ -130,6 +165,8 @@ export const DynamicForm: React.FC = (props) => { e.preventDefault(); setLoading(true); setError(null); + setFieldErrors(new Map()); // Clear field errors on new submit attempt + try { if (recordId) { await api.put(`/api/v1/data/${schemaName}/${recordId}`, formData); @@ -139,7 +176,18 @@ export const DynamicForm: React.FC = (props) => { props.onSave(); } catch (err: any) { console.error(err); - setError("Failed to save data"); + if (err.response && err.response.data && Array.isArray(err.response.data.errors)) { + const newFieldErrors = new Map(); + err.response.data.errors.forEach((e: any) => { + if (e.field && e.detail) { + newFieldErrors.set(e.field, e.detail); + } + }); + setFieldErrors(newFieldErrors); + setError(err.response.data.message || "Validation failed."); + } else { + setError("Failed to save data"); + } } finally { setLoading(false); } @@ -163,11 +211,12 @@ export const DynamicForm: React.FC = (props) => { {error &&
{error}
} -
+ setFormData((prev: any) => ({ ...prev, [field]: val }))} + fieldErrors={fieldErrors} />
diff --git a/dynaman-ui/src/pages/LayoutDesigner.tsx b/dynaman-ui/src/pages/LayoutDesigner.tsx index 4b6c4cb..9491e76 100644 --- a/dynaman-ui/src/pages/LayoutDesigner.tsx +++ b/dynaman-ui/src/pages/LayoutDesigner.tsx @@ -12,6 +12,7 @@ interface SchemaField { name: string; label: string; field_type: string; + is_required: boolean; } interface Schema { @@ -26,20 +27,25 @@ interface LayoutItem { // Field props fieldName?: string; fieldType?: string; + required?: boolean; + readOnly?: boolean; + placeholder?: string; + helperText?: string; // Structure props structureType?: string; children?: LayoutItem[]; } // Draggable Toolbox Item Component -function ToolboxItem({ id, label, type }: { id: string, label: string, type: string }) { +function ToolboxItem({ id, label, type, isRequired }: { id: string, label: string, type: string, isRequired: boolean }) { const { attributes, listeners, setNodeRef, transform } = useDraggable({ id: id, data: { type: 'field', label, fieldName: id.replace('field-', ''), - fieldType: type + fieldType: type, + required: isRequired } }); @@ -53,9 +59,10 @@ function ToolboxItem({ id, label, type }: { id: string, label: string, type: str style={style} {...listeners} {...attributes} - className="bg-white border p-2 rounded shadow-sm text-sm cursor-grab hover:border-primary touch-none" + className="bg-white border p-2 rounded shadow-sm text-sm cursor-grab hover:border-primary touch-none flex justify-between items-center" > - {label} ({type}) + {label} ({type}) + {isRequired && *}
); } @@ -89,24 +96,46 @@ function ToolboxStructure({ id, label }: { id: string, label: string }) { } // Render Item on Canvas -function CanvasItem({ item, onDelete }: { item: LayoutItem, onDelete: (id: string) => void }) { +function CanvasItem({ + item, + onDelete, + onClick, + isSelected +}: { + item: LayoutItem, + onDelete: (id: string) => void, + onClick: () => void, + isSelected: boolean +}) { + const borderClass = isSelected ? 'border-primary ring-2 ring-primary/20' : 'hover:border-primary'; + if (item.type === 'field') { return ( -
+
{ + e.stopPropagation(); + onClick(); + }} + >
- + {item.fieldType === 'boolean' ? (
- + Checkbox
) : item.fieldType === 'number' ? ( - + ) : item.fieldType === 'date' ? ( - + ) : ( - + )} + {item.helperText &&

<p className="text-xs text-gray-500 mt-1">{item.helperText}</p>}
@@ -146,6 +181,7 @@ export default function LayoutDesigner() { const [loading, setLoading] = useState(true); const [activeDragId, setActiveDragId] = useState(null); const [showSettings, setShowSettings] = useState(false); // For settings modal + const [selectedItemId, setSelectedItemId] = useState(null); // Load Schema, Layouts, and Groups useEffect(() => { @@ -179,8 +215,10 @@ export default function LayoutDesigner() { useEffect(() => { if (currentLayout) { setDefinition(currentLayout.definition || []); + setSelectedItemId(null); } else { setDefinition([]); + setSelectedItemId(null); } }, [currentLayout]); @@ -206,9 +244,20 @@ export default function LayoutDesigner() { const handleSave = async () => { if (!currentLayout) return; + + const definitionToSave = definition.map(item => { + if (item.type === 'field' && item.fieldName && schema) { + const correspondingSchemaField = schema.fields.find(f => f.name === item.fieldName); + if (correspondingSchemaField && correspondingSchemaField.is_required) { + return { ...item, required: true }; // Force required true if schema says so + } + } + return item; + }); + try { const updated = await layoutApi.update(currentLayout._id, { - definition: definition, + definition: definitionToSave, target_group_ids: currentLayout.target_group_ids, is_default: currentLayout.is_default }); @@ -235,6 +284,13 @@ export default function LayoutDesigner() { const handleDeleteItem = (id: string) => { setDefinition(prev => prev.filter(item => item.id !== id)); + if (selectedItemId === id) setSelectedItemId(null); + }; + + const updateItem = (id: string, updates: Partial) => { + setDefinition(prev => prev.map(item => + item.id === id ? 
{ ...item, ...updates } : item + )); }; const handleDragStart = (event: DragStartEvent) => { @@ -256,7 +312,8 @@ export default function LayoutDesigner() { fieldName: data.fieldName, fieldType: data.fieldType, structureType: data.structureType, - children: [] + children: [], + required: data.required // Init from drop }; setDefinition(prev => [...prev, newItem]); @@ -273,6 +330,7 @@ export default function LayoutDesigner() {
setSelectedItemId(null)} // Deselect when clicking canvas background > {!currentLayout ? (
Select or create a layout to start editing
@@ -284,7 +342,13 @@ export default function LayoutDesigner() {
)} {definition.map(item => ( - + setSelectedItemId(item.id)} + isSelected={selectedItemId === item.id} + /> ))}
)} @@ -292,6 +356,13 @@ export default function LayoutDesigner() { ); }; + const selectedItem = definition.find(item => item.id === selectedItemId); + const selectedSchemaField = selectedItem && schema + ? schema.fields.find(f => f.name === selectedItem.fieldName) + : null; + + const isSchemaRequired = selectedSchemaField?.is_required || false; + if (loading) return
<div>Loading designer...</div>; if (!schema) return <div>Schema not found</div>
; @@ -380,7 +451,13 @@ export default function LayoutDesigner() {

Fields

{schema.fields.map(field => ( - <ToolboxItem key={field.name} id={`field-${field.name}`} label={field.label} type={field.field_type} /> + <ToolboxItem key={field.name} id={`field-${field.name}`} label={field.label} type={field.field_type} isRequired={field.is_required} /> ))}
@@ -398,9 +475,88 @@ export default function LayoutDesigner() { {/* Right Sidebar: Properties */} -
diff --git a/engine/main.py b/engine/main.py index 67230ec..3c7e6ae 100644 --- a/engine/main.py +++ b/engine/main.py @@ -40,6 +40,10 @@ async def lifespan(app: FastAPI): openapi_url=openapi_url ) +# Import and setup OpenTelemetry +from opentelemetry_config import setup_opentelemetry +setup_opentelemetry(app) + # CORS Configuration origins = [ "http://localhost:5173", # Vite default diff --git a/engine/metadata_context/domain/entities/form_layout.py b/engine/metadata_context/domain/entities/form_layout.py index 15ffb73..7403179 100644 --- a/engine/metadata_context/domain/entities/form_layout.py +++ b/engine/metadata_context/domain/entities/form_layout.py @@ -15,6 +15,13 @@ class LayoutComponent(BaseModel): field_name: Optional[str] = Field(None, alias="fieldName") field_type: Optional[str] = Field(None, alias="fieldType") structure_type: Optional[str] = Field(None, alias="structureType") + + # UI Properties + required: bool = False + read_only: bool = Field(False, alias="readOnly") + placeholder: Optional[str] = None + helper_text: Optional[str] = Field(None, alias="helperText") + props: Dict[str, Any] = Field(default_factory=dict) model_config = ConfigDict(populate_by_name=True) diff --git a/engine/opentelemetry_config.py b/engine/opentelemetry_config.py new file mode 100644 index 0000000..5b6f7de --- /dev/null +++ b/engine/opentelemetry_config.py @@ -0,0 +1,87 @@ +import os +from opentelemetry import trace, metrics +from opentelemetry.sdk.trace import TracerProvider +from opentelemetry.sdk.trace.export import BatchSpanProcessor +from opentelemetry.sdk.resources import Resource +from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter +from opentelemetry.sdk.metrics import MeterProvider +from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader +from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter + +from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor +from 
opentelemetry.instrumentation.pymongo import PymongoInstrumentor +from opentelemetry.instrumentation.requests import RequestsInstrumentor + + +def setup_opentelemetry(app): + """Configure OpenTelemetry for the application.""" + + if os.environ.get("OTEL_ENABLED") != "true": + print("OpenTelemetry is disabled.") + return + + print("OpenTelemetry is enabled. Initializing...") + # Get the service name from an environment variable, default to 'engine-service' + service_name = os.environ.get("OTEL_SERVICE_NAME", "engine-service") + deployment_environment = os.environ.get("APP_ENVIRONMENT", "unknown") + + # Set up a resource with the service name + resource = Resource(attributes={ + "service.name": service_name, + "deployment.environment": deployment_environment, + }) + + # --- TRACES SETUP --- + # Set up a TracerProvider + tracer_provider = TracerProvider(resource=resource) + trace.set_tracer_provider(tracer_provider) + + # Configure the OTLP exporter to send data to the collector sidecar + # The endpoint is the default for the OTel Collector's gRPC port + otlp_exporter = OTLPSpanExporter( + endpoint=os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4317"), + insecure=True # Use insecure connection for localhost communication + ) + + # Use a BatchSpanProcessor to send spans in batches + span_processor = BatchSpanProcessor(otlp_exporter) + tracer_provider.add_span_processor(span_processor) + # --------------------- + + # --- METRICS SETUP --- + # Configure the Metric Exporter (pointing to Collector gRPC port) + metric_exporter = OTLPMetricExporter( + endpoint=os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4317"), + insecure=True + ) + + # Metrics are sent periodically (default is every 60s) + reader = PeriodicExportingMetricReader(metric_exporter) + + # Set up the MeterProvider with the same resource tags + meter_provider = MeterProvider(resource=resource, metric_readers=[reader]) + metrics.set_meter_provider(meter_provider) + # 
--------------------- + + # Instrument FastAPI + FastAPIInstrumentor.instrument_app(app, tracer_provider=tracer_provider) + + # Instrument Pymongo for MongoDB queries (which covers motor) + def request_hook(span, event): + if span and span.is_recording(): + # Manually ensure New Relic sees the database name + # 'event.database_name' is provided by the pymongo monitoring API + span.set_attribute("db.name", event.database_name) + span.set_attribute("db.namespace", event.database_name) # Newer standard + + # Optional: capture the collection name if missing + if hasattr(event, 'command') and event.command: + collection = event.command.get(event.command_name) + if isinstance(collection, str): + span.set_attribute("db.collection.name", collection) + + # Apply the instrumentation with the hook + PymongoInstrumentor().instrument(tracer_provider=tracer_provider, request_hook=request_hook) + + # Instrument the requests library for any outgoing HTTP calls + RequestsInstrumentor().instrument(tracer_provider=tracer_provider) diff --git a/engine/requirements.txt b/engine/requirements.txt index 7709bcb..27dbe0a 100644 --- a/engine/requirements.txt +++ b/engine/requirements.txt @@ -14,3 +14,11 @@ python-jose[cryptography] mongomock mongomock-motor flake8 + +# OpenTelemetry +opentelemetry-api +opentelemetry-sdk +opentelemetry-exporter-otlp +opentelemetry-instrumentation-fastapi +opentelemetry-instrumentation-requests +opentelemetry-instrumentation-pymongo diff --git a/engine/tests/test_layout_properties.py b/engine/tests/test_layout_properties.py new file mode 100644 index 0000000..99c8da5 --- /dev/null +++ b/engine/tests/test_layout_properties.py @@ -0,0 +1,78 @@ +import pytest +from httpx import AsyncClient, ASGITransport +from main import app +from api.dependencies import verify_token + +@pytest.fixture +async def client(): + async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as ac: + yield ac + +@pytest.mark.asyncio +async def 
test_create_layout_with_properties(client): + """Test that extended properties (required, readOnly, etc.) are stored correctly""" + payload = { + "schema_name": "customer_extended", + "name": "Property Test View", + "definition": [ + { + "id": "field-1", + "type": "field", + "label": "Email", + "fieldName": "email", + "fieldType": "email", + "required": True, + "readOnly": True, + "placeholder": "Enter email here", + "helperText": "We will not spam you" + } + ] + } + response = await client.post("/api/v1/layouts/", json=payload) + assert response.status_code == 201 + data = response.json() + + item = data["definition"][0] + assert item["required"] is True + assert item["readOnly"] is True + assert item["placeholder"] == "Enter email here" + assert item["helperText"] == "We will not spam you" + +@pytest.mark.asyncio +async def test_update_layout_with_properties(client): + # 1. Create initial layout + create_res = await client.post("/api/v1/layouts/", json={ + "schema_name": "customer_extended", + "name": "Update Test", + "definition": [ + { + "id": "field-1", + "type": "field", + "label": "Name", + "fieldName": "name" + } + ] + }) + layout_id = create_res.json()["_id"] + + # 2. 
+    # Update with properties
+    update_payload = {
+        "definition": [
+            {
+                "id": "field-1",
+                "type": "field",
+                "label": "Name",
+                "fieldName": "name",
+                "required": True,
+                "helperText": "Updated helper"
+            }
+        ]
+    }
+
+    response = await client.put(f"/api/v1/layouts/{layout_id}", json=update_payload)
+    assert response.status_code == 200
+    data = response.json()
+
+    item = data["definition"][0]
+    assert item["required"] is True
+    assert item["helperText"] == "Updated helper"
diff --git a/infrastructure/cdk/lib/dynaman-stack.ts b/infrastructure/cdk/lib/dynaman-stack.ts
index 2490034..82dc18d 100644
--- a/infrastructure/cdk/lib/dynaman-stack.ts
+++ b/infrastructure/cdk/lib/dynaman-stack.ts
@@ -350,25 +350,28 @@ export class DynamanStack extends cdk.Stack {
     httpListener.addTargets('UiTarget', {
       port: 80,
       targets: [uiService],
-      healthCheck: { path: '/' }
+      healthCheck: { path: '/' },
+      deregistrationDelay: cdk.Duration.seconds(30),
     });
 
-    // Auth -> /api/v1/auth/*
+    // Auth -> /api/v1/auth/*, /api/v1/groups, /api/v1/groups/*
     httpListener.addTargets('AuthTarget', {
       priority: 100,
-      conditions: [elbv2.ListenerCondition.pathPatterns(['/api/v1/auth/*'])],
+      conditions: [elbv2.ListenerCondition.pathPatterns(['/api/v1/auth/*', '/api/v1/groups', '/api/v1/groups/*'])],
       port: 8000,
       targets: [authService],
-      healthCheck: { path: '/health' }
+      healthCheck: { path: '/health' },
+      deregistrationDelay: cdk.Duration.seconds(30),
     });
 
-    // Meta -> /api/v1/schemas/*
+    // Meta -> /api/v1/schemas/*, /api/v1/layouts, /api/v1/layouts/*
     httpListener.addTargets('MetaTarget', {
       priority: 110,
-      conditions: [elbv2.ListenerCondition.pathPatterns(['/api/v1/schemas/*'])],
+      conditions: [elbv2.ListenerCondition.pathPatterns(['/api/v1/schemas/*', '/api/v1/layouts', '/api/v1/layouts/*'])],
       port: 8000,
       targets: [metaService],
-      healthCheck: { path: '/api/v1/schemas/openapi.json' }
+      healthCheck: { path: '/api/v1/schemas/openapi.json' },
+      deregistrationDelay: cdk.Duration.seconds(30),
     });
 
     // Exec -> /api/v1/data/*
@@ -377,7 +380,8 @@ export class DynamanStack extends cdk.Stack {
       conditions: [elbv2.ListenerCondition.pathPatterns(['/api/v1/data/*'])],
       port: 8000,
       targets: [execService],
-      healthCheck: { path: '/api/v1/data/openapi.json' }
+      healthCheck: { path: '/api/v1/data/openapi.json' },
+      deregistrationDelay: cdk.Duration.seconds(30),
     });
 
     // Outputs
diff --git a/infrastructure/terraform/alb.tf b/infrastructure/terraform/alb.tf
index 4ac71eb..fac02b9 100644
--- a/infrastructure/terraform/alb.tf
+++ b/infrastructure/terraform/alb.tf
@@ -110,6 +110,22 @@ resource "aws_lb_listener" "http" {
 }
 
 # Listener Rules
+resource "aws_lb_listener_rule" "config" {
+  listener_arn = aws_lb_listener.http.arn
+  priority     = 90
+
+  action {
+    type             = "forward"
+    target_group_arn = aws_lb_target_group.auth.arn
+  }
+
+  condition {
+    path_pattern {
+      values = ["/api/v1/config/*"]
+    }
+  }
+}
+
 resource "aws_lb_listener_rule" "auth" {
   listener_arn = aws_lb_listener.http.arn
   priority     = 100
@@ -121,7 +137,7 @@ resource "aws_lb_listener_rule" "auth" {
 
   condition {
     path_pattern {
-      values = ["/api/v1/auth/*"]
+      values = ["/api/v1/auth/*", "/api/v1/groups", "/api/v1/groups/*"]
     }
   }
 }
@@ -137,7 +153,7 @@ resource "aws_lb_listener_rule" "engine_meta" {
 
   condition {
     path_pattern {
-      values = ["/api/v1/schemas/*"]
+      values = ["/api/v1/schemas/*", "/api/v1/layouts", "/api/v1/layouts/*"]
     }
   }
 }
diff --git a/infrastructure/terraform/ecs_cluster.tf b/infrastructure/terraform/ecs_cluster.tf
index 85e3c26..e399563 100644
--- a/infrastructure/terraform/ecs_cluster.tf
+++ b/infrastructure/terraform/ecs_cluster.tf
@@ -40,8 +40,8 @@ resource "aws_launch_template" "ecs" {
 resource "aws_autoscaling_group" "ecs" {
   name                = "${var.project_name}-asg"
   vpc_zone_identifier = [aws_subnet.public_1.id, aws_subnet.public_2.id]
-  desired_capacity    = 1
-  min_size            = 1
+  desired_capacity    = 2
+  min_size            = 2
   max_size            = 2
 
   launch_template {
diff --git a/infrastructure/terraform/ecs_services.tf b/infrastructure/terraform/ecs_services.tf
index 5d2f5e5..d202a71 100644
--- a/infrastructure/terraform/ecs_services.tf
+++ b/infrastructure/terraform/ecs_services.tf
@@ -1,3 +1,6 @@
+# Prepare OTel Collector configuration
+
+
 # UI Service
 resource "aws_ecs_task_definition" "ui" {
   family                   = "${var.project_name}-ui"
@@ -17,8 +20,8 @@ resource "aws_ecs_task_definition" "ui" {
     {
       name      = "ui"
       image     = aws_ecr_repository.ui.repository_url
-      cpu       = 256
-      memory    = 256
+      cpu       = 384
+      memory    = 384
       essential = true
       portMappings = [
         {
@@ -35,6 +38,96 @@ resource "aws_ecs_task_definition" "ui" {
           "awslogs-stream-prefix" = "ui"
         }
       }
+    },
+    {
+      name      = "otel-collector-ui"
+      image     = "public.ecr.aws/aws-observability/aws-otel-collector:latest"
+      cpu       = 128
+      memory    = 128
+      essential = true
+      command = [
+        "--config",
+        yamlencode({
+          receivers = {
+            otlp = {
+              protocols = {
+                grpc = { endpoint = "0.0.0.0:4317" }
+                http = { endpoint = "0.0.0.0:4318" }
+              }
+            }
+            # Only for the UI service; remove from others if not needed
+            nginx = {
+              endpoint            = "http://localhost:80/nginx_status"
+              collection_interval = "10s"
+            }
+          }
+          connectors = {
+            spanmetrics = {
+              histogram = {
+                explicit = {
+                  buckets = ["2ms", "6ms", "10ms", "100ms", "250ms", "500ms", "1s", "5s"]
+                }
+              }
+              dimensions = [
+                { name = "http.request.method" },
+                { name = "http.response.status_code" },
+                { name = "deployment.environment" }
+              ]
+            }
+          }
+          processors = {
+            batch = {
+              send_batch_size = 8192
+              timeout         = "10s"
+            }
+            cumulativetodelta = null
+            resourcedetection = {
+              detectors = ["env", "system"]
+            }
+          }
+          exporters = {
+            # Standard OTLP HTTP exporter for New Relic
+            otlphttp/newrelic = {
+              endpoint = "https://otlp.nr-data.net"
+              headers = {
+                "api-key" = "$${NEW_RELIC_LICENSE_KEY}"
+              }
+            }
+            debug = {
+              verbosity = "detailed"
+            }
+          }
+          service = {
+            pipelines = {
+              traces = {
+                receivers  = ["otlp"]
+                processors = ["resourcedetection", "batch"]
+                exporters  = ["otlphttp/newrelic", "spanmetrics"]
+              }
+              metrics = {
+                # Add "nginx" here only for the UI task definition
+                receivers  = ["otlp", "spanmetrics", "nginx"]
+                processors = ["resourcedetection", "cumulativetodelta", "batch"]
+                exporters  = ["otlphttp/newrelic", "debug"]
+              }
+            }
+          }
+        })
+      ]
+      secrets = [
+        {
+          name      = "NEW_RELIC_LICENSE_KEY"
+          valueFrom = aws_secretsmanager_secret.new_relic_license_key.arn
+        }
+      ]
+      logConfiguration = {
+        logDriver = "awslogs"
+        options = {
+          "awslogs-group"         = aws_cloudwatch_log_group.main.name
+          "awslogs-region"        = var.aws_region
+          "awslogs-stream-prefix" = "otel-collector-ui"
+        }
+      }
     }
   ])
 }
@@ -58,8 +151,8 @@ resource "aws_ecs_task_definition" "auth" {
   family                   = "${var.project_name}-auth"
   network_mode             = "bridge"
   requires_compatibilities = ["EC2"]
-  cpu                      = 256
-  memory                   = 256
+  cpu                      = 384
+  memory                   = 384
   execution_role_arn       = aws_iam_role.ecs_task_execution_role.arn
   task_role_arn            = aws_iam_role.ecs_task_role.arn
 
@@ -83,7 +176,10 @@ resource "aws_ecs_task_definition" "auth" {
         }
       ]
       environment = [
-        { name = "DATABASE_NAME", value = "dynaman_auth" }
+        { name = "DATABASE_NAME", value = "dynaman_auth" },
+        { name = "OTEL_SERVICE_NAME", value = "auth-service" },
+        { name = "OTEL_EXPORTER_OTLP_ENDPOINT", value = "http://localhost:4317" },
+        { name = "APP_ENVIRONMENT", value = "production" }
       ]
       secrets = [
        { name = "MONGODB_URL", valueFrom = aws_secretsmanager_secret.mongodb_url.arn },
@@ -97,6 +193,96 @@ resource "aws_ecs_task_definition" "auth" {
           "awslogs-stream-prefix" = "auth"
         }
       }
+    },
+    {
+      name      = "otel-collector"
+      image     = "public.ecr.aws/aws-observability/aws-otel-collector:latest"
+      cpu       = 128
+      memory    = 128
+      essential = true
+      command = [
+        "--config",
+        yamlencode({
+          receivers = {
+            otlp = {
+              protocols = {
+                grpc = { endpoint = "0.0.0.0:4317" }
+                http = { endpoint = "0.0.0.0:4318" }
+              }
+            }
+            # Only for the UI service; remove from others if not needed
+            nginx = {
+              endpoint            = "http://localhost:80/nginx_status"
+              collection_interval = "10s"
+            }
+          }
+          connectors = {
+            spanmetrics = {
+              histogram = {
+                explicit = {
+                  buckets = ["2ms", "6ms", "10ms", "100ms", "250ms", "500ms", "1s", "5s"]
+                }
+              }
+              dimensions = [
+                { name = "http.request.method" },
+                { name = "http.response.status_code" },
+                { name = "deployment.environment" }
+              ]
+            }
+          }
+          processors = {
+            batch = {
+              send_batch_size = 8192
+              timeout         = "10s"
+            }
+            cumulativetodelta = null
+            resourcedetection = {
+              detectors = ["env", "system"]
+            }
+          }
+          exporters = {
+            # Standard OTLP HTTP exporter for New Relic
+            otlphttp/newrelic = {
+              endpoint = "https://otlp.nr-data.net"
+              headers = {
+                "api-key" = "$${NEW_RELIC_LICENSE_KEY}"
+              }
+            }
+            debug = {
+              verbosity = "detailed"
+            }
+          }
+          service = {
+            pipelines = {
+              traces = {
+                receivers  = ["otlp"]
+                processors = ["resourcedetection", "batch"]
+                exporters  = ["otlphttp/newrelic", "spanmetrics"]
+              }
+              metrics = {
+                # Add "nginx" here only for the UI task definition
+                receivers  = ["otlp", "spanmetrics", "nginx"]
+                processors = ["resourcedetection", "cumulativetodelta", "batch"]
+                exporters  = ["otlphttp/newrelic", "debug"]
+              }
+            }
+          }
+        })
+      ]
+      secrets = [
+        {
+          name      = "NEW_RELIC_LICENSE_KEY"
+          valueFrom = aws_secretsmanager_secret.new_relic_license_key.arn
+        }
+      ]
+      logConfiguration = {
+        logDriver = "awslogs"
+        options = {
+          "awslogs-group"         = aws_cloudwatch_log_group.main.name
+          "awslogs-region"        = var.aws_region
+          "awslogs-stream-prefix" = "otel-collector-auth"
+        }
+      }
     }
   ])
 }
@@ -120,8 +306,8 @@ resource "aws_ecs_task_definition" "meta" {
   family                   = "${var.project_name}-meta"
   network_mode             = "bridge"
   requires_compatibilities = ["EC2"]
-  cpu                      = 256
-  memory                   = 256
+  cpu                      = 384
+  memory                   = 384
   execution_role_arn       = aws_iam_role.ecs_task_execution_role.arn
   task_role_arn            = aws_iam_role.ecs_task_role.arn
 
@@ -146,7 +332,10 @@ resource "aws_ecs_task_definition" "meta" {
       ]
       environment = [
         { name = "DATABASE_NAME", value = "dynaman" },
-        { name = "APP_MODE", value = "metadata" }
+        { name = "APP_MODE", value = "metadata" },
+        { name = "OTEL_SERVICE_NAME", value = "engine-meta" },
+        { name = "OTEL_EXPORTER_OTLP_ENDPOINT", value = "http://localhost:4317" },
+        { name = "APP_ENVIRONMENT", value = "production" }
       ]
       secrets = [
        { name = "MONGODB_URL", valueFrom = aws_secretsmanager_secret.mongodb_url.arn },
@@ -160,6 +349,96 @@ resource "aws_ecs_task_definition" "meta" {
           "awslogs-stream-prefix" = "meta"
         }
       }
+    },
+    {
+      name      = "otel-collector"
+      image     = "public.ecr.aws/aws-observability/aws-otel-collector:latest"
+      cpu       = 128
+      memory    = 128
+      essential = true
+      command = [
+        "--config",
+        yamlencode({
+          receivers = {
+            otlp = {
+              protocols = {
+                grpc = { endpoint = "0.0.0.0:4317" }
+                http = { endpoint = "0.0.0.0:4318" }
+              }
+            }
+            # Only for the UI service; remove from others if not needed
+            nginx = {
+              endpoint            = "http://localhost:80/nginx_status"
+              collection_interval = "10s"
+            }
+          }
+          connectors = {
+            spanmetrics = {
+              histogram = {
+                explicit = {
+                  buckets = ["2ms", "6ms", "10ms", "100ms", "250ms", "500ms", "1s", "5s"]
+                }
+              }
+              dimensions = [
+                { name = "http.request.method" },
+                { name = "http.response.status_code" },
+                { name = "deployment.environment" }
+              ]
+            }
+          }
+          processors = {
+            batch = {
+              send_batch_size = 8192
+              timeout         = "10s"
+            }
+            cumulativetodelta = null
+            resourcedetection = {
+              detectors = ["env", "system"]
+            }
+          }
+          exporters = {
+            # Standard OTLP HTTP exporter for New Relic
+            otlphttp/newrelic = {
+              endpoint = "https://otlp.nr-data.net"
+              headers = {
+                "api-key" = "$${NEW_RELIC_LICENSE_KEY}"
+              }
+            }
+            debug = {
+              verbosity = "detailed"
+            }
+          }
+          service = {
+            pipelines = {
+              traces = {
+                receivers  = ["otlp"]
+                processors = ["resourcedetection", "batch"]
+                exporters  = ["otlphttp/newrelic", "spanmetrics"]
+              }
+              metrics = {
+                # Add "nginx" here only for the UI task definition
+                receivers  = ["otlp", "spanmetrics", "nginx"]
+                processors = ["resourcedetection", "cumulativetodelta", "batch"]
+                exporters  = ["otlphttp/newrelic", "debug"]
+              }
+            }
+          }
+        })
+      ]
+      secrets = [
+        {
+          name      = "NEW_RELIC_LICENSE_KEY"
+          valueFrom = aws_secretsmanager_secret.new_relic_license_key.arn
+        }
+      ]
+      logConfiguration = {
+        logDriver = "awslogs"
+        options = {
+          "awslogs-group"         = aws_cloudwatch_log_group.main.name
+          "awslogs-region"        = var.aws_region
+          "awslogs-stream-prefix" = "otel-collector-meta"
+        }
+      }
     }
   ])
 }
@@ -183,8 +462,8 @@ resource "aws_ecs_task_definition" "exec" {
   family                   = "${var.project_name}-exec"
   network_mode             = "bridge"
   requires_compatibilities = ["EC2"]
-  cpu                      = 256
-  memory                   = 256
+  cpu                      = 384
+  memory                   = 384
   execution_role_arn       = aws_iam_role.ecs_task_execution_role.arn
   task_role_arn            = aws_iam_role.ecs_task_role.arn
 
@@ -209,7 +488,10 @@ resource "aws_ecs_task_definition" "exec" {
       ]
      environment = [
         { name = "DATABASE_NAME", value = "dynaman" },
-        { name = "APP_MODE", value = "execution" }
+        { name = "APP_MODE", value = "execution" },
+        { name = "OTEL_SERVICE_NAME", value = "engine-exec" },
+        { name = "OTEL_EXPORTER_OTLP_ENDPOINT", value = "http://localhost:4317" },
+        { name = "APP_ENVIRONMENT", value = "production" }
       ]
       secrets = [
        { name = "MONGODB_URL", valueFrom = aws_secretsmanager_secret.mongodb_url.arn },
@@ -223,6 +505,96 @@ resource "aws_ecs_task_definition" "exec" {
           "awslogs-stream-prefix" = "exec"
         }
       }
+    },
+    {
+      name      = "otel-collector"
+      image     = "public.ecr.aws/aws-observability/aws-otel-collector:latest"
+      cpu       = 128
+      memory    = 128
+      essential = true
+      command = [
+        "--config",
+        yamlencode({
+          receivers = {
+            otlp = {
+              protocols = {
+                grpc = { endpoint = "0.0.0.0:4317" }
+                http = { endpoint = "0.0.0.0:4318" }
+              }
+            }
+            # Only for the UI service; remove from others if not needed
+            nginx = {
+              endpoint            = "http://localhost:80/nginx_status"
+              collection_interval = "10s"
+            }
+          }
+          connectors = {
+            spanmetrics = {
+              histogram = {
+                explicit = {
+                  buckets = ["2ms", "6ms", "10ms", "100ms", "250ms", "500ms", "1s", "5s"]
+                }
+              }
+              dimensions = [
+                { name = "http.request.method" },
+                { name = "http.response.status_code" },
+                { name = "deployment.environment" }
+              ]
+            }
+          }
+          processors = {
+            batch = {
+              send_batch_size = 8192
+              timeout         = "10s"
+            }
+            cumulativetodelta = null
+            resourcedetection = {
+              detectors = ["env", "system"]
+            }
+          }
+          exporters = {
+            # Standard OTLP HTTP exporter for New Relic
+            otlphttp/newrelic = {
+              endpoint = "https://otlp.nr-data.net"
+              headers = {
+                "api-key" = "$${NEW_RELIC_LICENSE_KEY}"
+              }
+            }
+            debug = {
+              verbosity = "detailed"
+            }
+          }
+          service = {
+            pipelines = {
+              traces = {
+                receivers  = ["otlp"]
+                processors = ["resourcedetection", "batch"]
+                exporters  = ["otlphttp/newrelic", "spanmetrics"]
+              }
+              metrics = {
+                # Add "nginx" here only for the UI task definition
+                receivers  = ["otlp", "spanmetrics", "nginx"]
+                processors = ["resourcedetection", "cumulativetodelta", "batch"]
+                exporters  = ["otlphttp/newrelic", "debug"]
+              }
+            }
+          }
+        })
+      ]
+      secrets = [
+        {
+          name      = "NEW_RELIC_LICENSE_KEY"
+          valueFrom = aws_secretsmanager_secret.new_relic_license_key.arn
+        }
+      ]
+      logConfiguration = {
+        logDriver = "awslogs"
+        options = {
+          "awslogs-group"         = aws_cloudwatch_log_group.main.name
+          "awslogs-region"        = var.aws_region
+          "awslogs-stream-prefix" = "otel-collector-exec"
+        }
+      }
     }
   ])
 }
diff --git a/infrastructure/terraform/iam.tf b/infrastructure/terraform/iam.tf
index da1c8d0..0a71be6 100644
--- a/infrastructure/terraform/iam.tf
+++ b/infrastructure/terraform/iam.tf
@@ -21,9 +21,9 @@ resource "aws_iam_role_policy_attachment" "ecs_task_execution_role_policy" {
   policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
 }
 
-# Allow ECS Execution Role to read SSM Parameters
-resource "aws_iam_role_policy" "ecs_ssm_secrets" {
-  name = "${var.project_name}-ecs-ssm-secrets"
+# Allow ECS Execution Role to access secrets from Secrets Manager
+resource "aws_iam_role_policy" "ecs_secrets_manager_access" {
+  name = "${var.project_name}-ecs-secrets-manager-access"
   role = aws_iam_role.ecs_task_execution_role.id
 
   policy = jsonencode({
@@ -31,13 +31,11 @@ resource "aws_iam_role_policy" "ecs_ssm_secrets" {
     Statement = [
       {
         Effect = "Allow"
-        Action = [
-          "ssm:GetParameters",
-          "ssm:GetParameter"
-        ]
+        Action = "secretsmanager:GetSecretValue"
         Resource = [
-          aws_ssm_parameter.mongodb_url.arn,
-          aws_ssm_parameter.jwt_secret_key.arn
+          aws_secretsmanager_secret.mongodb_url.arn,
+          aws_secretsmanager_secret.jwt_secret_key.arn,
+          aws_secretsmanager_secret.new_relic_license_key.arn
         ]
       }
     ]
diff --git a/infrastructure/terraform/secrets.tf b/infrastructure/terraform/secrets.tf
index 502120e..59b33aa 100644
--- a/infrastructure/terraform/secrets.tf
+++ b/infrastructure/terraform/secrets.tf
@@ -25,3 +25,19 @@ resource "aws_secretsmanager_secret_version" "jwt_secret_key" {
   secret_id     = aws_secretsmanager_secret.jwt_secret_key.id
   secret_string = var.jwt_secret_key
 }
+
+resource "aws_secretsmanager_secret" "new_relic_license_key" {
+  name        = "/${var.project_name}/${var.environment}/NEW_RELIC_LICENSE_KEY"
+  description = "New Relic License Key for OpenTelemetry"
+
+  tags = {
+    Environment = var.environment
+  }
+}
+
+resource "aws_secretsmanager_secret_version" "new_relic_license_key" {
+  secret_id     = aws_secretsmanager_secret.new_relic_license_key.id
+  secret_string = var.new_relic_license_key
+}
+
+
diff --git a/infrastructure/terraform/variables.tf b/infrastructure/terraform/variables.tf
index 6241cbf..992bd51 100644
--- a/infrastructure/terraform/variables.tf
+++ b/infrastructure/terraform/variables.tf
@@ -43,3 +43,12 @@ variable "jwt_secret_key" {
   type      = string
   sensitive = true
 }
+
+variable "new_relic_license_key" {
+  description = "New Relic License Key for sending OTel data."
+  type        = string
+  sensitive   = true
+  default     = "YOUR_NEW_RELIC_LICENSE_KEY_PLACEHOLDER"
+}
+
+
diff --git a/nginx-gateway.conf b/nginx-gateway.conf
index ba78a5c..2bd2635 100644
--- a/nginx-gateway.conf
+++ b/nginx-gateway.conf
@@ -34,6 +34,12 @@ http {
             proxy_set_header Authorization $http_authorization;
         }
 
+        location /api/v1/config/ {
+            proxy_pass http://auth-service:8000;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+        }
+
         location /api/v1/layouts {
             proxy_pass http://engine-metadata:8000;
             proxy_set_header Host $host;
diff --git a/otel-collector-local-config.yaml b/otel-collector-local-config.yaml
new file mode 100644
index 0000000..94db93e
--- /dev/null
+++ b/otel-collector-local-config.yaml
@@ -0,0 +1,63 @@
+receivers:
+  otlp:
+    protocols:
+      grpc:
+        endpoint: "0.0.0.0:4317"
+      http:
+        endpoint: "0.0.0.0:4318"
+        cors:
+          allowed_origins:
+            - "http://localhost:3000"
+            - "http://localhost:8000"
+            - "http://localhost:5173"
+          # Added to prevent CORS preflight blocks from browsers
+          allowed_headers:
+            - "content-type"
+            - "x-otlp-version"
+
+  # NEW: Scrapes Nginx status from your UI container
+  nginx:
+    endpoint: "http://dynaman-ui:80/nginx_status"
+    collection_interval: 10s
+
+connectors:
+  spanmetrics:
+    histogram:
+      explicit:
+        buckets: [2ms, 6ms, 10ms, 100ms, 250ms, 500ms, 1s, 5s]
+    dimensions:
+      - name: http.request.method
+      - name: http.response.status_code
+      - name: deployment.environment
+
+processors:
+  batch:
+    send_batch_size: 8192
+    timeout: 10s
+
+  cumulativetodelta:
+
+  resourcedetection:
+    detectors: [env, system]
+
+exporters:
+  otlphttp/newrelic:
+    endpoint: "https://otlp.nr-data.net"
+    headers:
+      "api-key": "${NEW_RELIC_LICENSE_KEY}"
+
+  debug:
+    verbosity: detailed
+
+service:
+  pipelines:
+    traces:
+      receivers: [otlp]
+      processors: [resourcedetection, batch]
+      exporters: [otlphttp/newrelic, spanmetrics]
+
+    metrics:
+      # Added 'nginx' to the receivers list
+      receivers: [otlp, spanmetrics, nginx]
+      processors: [resourcedetection, cumulativetodelta, batch]
+      exporters: [otlphttp/newrelic, debug]
\ No newline at end of file
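
---

The collector configuration in this patch is duplicated across four ECS task definitions plus the local YAML file, so a wiring mistake in one copy (e.g. a pipeline referencing a receiver that was removed, as the inline `nginx` comments warn about) is easy to miss. A minimal sketch of a sanity check, as a hypothetical stdlib-only Python helper that is not part of the patch: it verifies that every component a pipeline references is actually defined, treating connectors (like `spanmetrics`) as valid receivers or exporters.

```python
# Hypothetical helper: validate OTel Collector pipeline wiring before
# baking the config into a task definition via yamlencode().

def undefined_pipeline_refs(config):
    """Return 'pipeline.section: name' entries for components that a
    pipeline references but the config never defines."""
    missing = []
    # Connectors may legally appear in pipelines as receivers or exporters.
    connectors = set(config.get("connectors") or {})
    for pipeline_name, pipeline in (config.get("service", {}).get("pipelines") or {}).items():
        for section in ("receivers", "processors", "exporters"):
            defined = set(config.get(section) or {}) | connectors
            for ref in pipeline.get(section, []):
                if ref not in defined:
                    missing.append(f"{pipeline_name}.{section}: {ref}")
    return missing

# Minimal mirror of the collector config used in the patch above.
config = {
    "receivers": {"otlp": {}, "nginx": {}},
    "connectors": {"spanmetrics": {}},
    "processors": {"batch": {}, "cumulativetodelta": None, "resourcedetection": {}},
    "exporters": {"otlphttp/newrelic": {}, "debug": {}},
    "service": {
        "pipelines": {
            "traces": {
                "receivers": ["otlp"],
                "processors": ["resourcedetection", "batch"],
                "exporters": ["otlphttp/newrelic", "spanmetrics"],
            },
            "metrics": {
                "receivers": ["otlp", "spanmetrics", "nginx"],
                "processors": ["resourcedetection", "cumulativetodelta", "batch"],
                "exporters": ["otlphttp/newrelic", "debug"],
            },
        }
    },
}

print(undefined_pipeline_refs(config))  # [] — every reference resolves
```

Removing the `nginx` receiver for the non-UI services while forgetting to drop it from the `metrics` pipeline would be reported here as `metrics.receivers: nginx` instead of failing at collector startup.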