Predogma patch 2 #441

Open · wants to merge 24 commits into base: main

Commits (24):
* 4e655af - Add Fleet & Agent 8.8.2 release notes (#275) (#300) (kilfoyle, Jun 29, 2023)
* c7c65df - Update fleet-settings.asciidoc (#303) (#305) (kilfoyle, Jul 4, 2023)
* 09941f6 - Fleet Server Scaling should contain the note about max 500 policies (… (mergify[bot], Jul 7, 2023)
* 8807c9e - We are missing the mentioning of artifacts.elastic.co in our overview… (mergify[bot], Jul 7, 2023)
* 6362458 - Update API generation docs readme to refer to ingest-docs repo (#316)… (mergify[bot], Jul 10, 2023)
* 2223fb6 - Update README.md with correct paths for new ingest-docs repo (#319) (… (mergify[bot], Jul 11, 2023)
* f698a3d - Add steps to generate Elastic Agent diagnostic from Fleet in the trou… (mergify[bot], Jul 11, 2023)
* 63d4dda - Add docs for upgrade package verification with alternate PGP keys (#3… (mergify[bot], Jul 19, 2023)
* 85d92fe - Add Fleet & Agent 8.9.0 release notes (#341) (#343) (mergify[bot], Jul 24, 2023)
* 0d6a657 - Update recommended Disk and RSS Mem Size settings for Elastic Agent (… (mergify[bot], Jul 26, 2023)
* 4947450 - Add updated proxy settings docs (#271) (#356) (mergify[bot], Jul 27, 2023)
* 46f17b4 - Update install-elastic-agent.asciidoc (emrcbrn, Jul 28, 2023)
* 18821b4 - [8.9] Remove tested version number from agent minimum install reqs (b… (mergify[bot], Jul 28, 2023)
* da9276d - for connectivity to LS we need to have a cert defined (#377) (#381) (mergify[bot], Aug 2, 2023)
* 8b423b7 - Note that agent needs to be restarted if the upgrade happened via RPM… (mergify[bot], Aug 2, 2023)
* aee97a9 - Need more clarity for the fleet server scale documentation (#383) (#387) (mergify[bot], Aug 3, 2023)
* 0f27e6e - adding docs for default container in 8.9 (#386) (gizas, Aug 4, 2023)
* 837d183 - Fixing typos in scaling-on-kubernetes.asciidoc (#396) (daverick, Aug 7, 2023)
* 7b6435a - Fix images on Elastic Agent K8s Scaling page (#397) (kilfoyle, Aug 7, 2023)
* 10ec97f - Add an introduction to `FLEET_SERVER_ELASTICSEARCH_CA_TRUSTED_FINGERP… (mergify[bot], Aug 14, 2023)
* db5b01c - Include profiling in air-gapped env (#412) (#415) (mergify[bot], Aug 16, 2023)
* c10bb2f - Add Fleet & Agent 8.9.1 release notes (#403) (#414) (mergify[bot], Aug 16, 2023)
* 24e1967 - Adds tip about creating Elastic Defend policy using Fleet API (#417) … (mergify[bot], Aug 21, 2023)
* 5cdc1c4 - Update air-gapped.asciidoc (predogma, Aug 30, 2023)
15 changes: 14 additions & 1 deletion docs/en/ingest-management/commands.asciidoc
@@ -461,7 +461,8 @@ manager like systemd.

IMPORTANT: If you installed {agent} from a DEB or RPM package, the `install`
command will skip the installation itself and function as an alias of the
<<elastic-agent-enroll-command,`enroll` command>> instead. Note that after
upgrading {agent} via a DEB or RPM package, the {agent} service needs to be restarted.
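As an illustrative sketch (the package file name is an assumption for a typical systemd host), the restart after an RPM upgrade might look like:

[source,shell]
----
# Upgrade the package (file name is illustrative):
sudo rpm -U elastic-agent-8.9.1-x86_64.rpm

# Restart the service so the upgraded binary is picked up:
sudo systemctl restart elastic-agent
----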

You must run this command as the root user (or Administrator on Windows)
to write files to the correct locations. This command overwrites the
@@ -825,9 +826,21 @@ The version of {agent} to upgrade to.
The source URI to download the new version from. By default, {agent} uses the
Elastic Artifacts URL.

`--skip-verify`::
Skip the package verification process. This option is not recommended as it is insecure.

`--pgp-path <string>`::
Use a locally stored copy of the PGP key to verify the upgrade package.

`--pgp-uri <string>`::
Use the specified online PGP key to verify the upgrade package.

`--help`::
Show help for the `upgrade` command.

For details about using the `--skip-verify`, `--pgp-path <string>`, and `--pgp-uri <string>`
package verification options, refer to <<upgrade-standalone-verify-package>>.
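For example, a hedged sketch of verifying an upgrade package against a locally stored PGP key (the key path is an assumption):

[source,shell]
----
# Verify the 8.9.1 upgrade package against a local copy of the PGP key:
elastic-agent upgrade 8.9.1 --pgp-path /path/to/GPG-KEY-elastic-agent
----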

{global-flags-link}

[discrete]
@@ -81,7 +81,7 @@ and avoid placing passwords in plain text.

The stream to use for logs collection, for example, stdout/stderr.

If the specified package has no logs support, a generic container's logs input is used as a fallback. See the "Hints autodiscovery for Kubernetes log collection" example below.

[discrete]
== Available packages that support hints autodiscovery
@@ -92,7 +92,7 @@ https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-age
[discrete]
== Configure hints autodiscovery

To enable hints autodiscovery, you must add `hints.enabled: true` to the provider's configuration:

[source,yaml]
----
@@ -134,8 +134,15 @@ initContainers:
mountPath: /etc/elastic-agent/inputs.d
----


NOTE: {agent} can load multiple configuration files from `{path.config}/inputs.d` and produce a unified configuration from them (refer to <<elastic-agent-configuration>>). Users can manually mount their own templates under `/etc/elastic-agent/inputs.d` *if they want to skip enabling the initContainers section*.
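For illustration only (the ConfigMap and container names are assumptions), mounting your own templates instead of enabling the initContainers section could look like:

[source,yaml]
----
# Hypothetical ConfigMap-based mount of custom input templates.
volumes:
  - name: external-inputs
    configMap:
      name: agent-input-templates   # assumed ConfigMap holding *.yml templates
containers:
  - name: elastic-agent
    volumeMounts:
      - name: external-inputs
        mountPath: /etc/elastic-agent/inputs.d
----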


[discrete]
== Examples

[discrete]
=== Hints autodiscovery for redis

Enabling hints allows users deploying Pods on the cluster to automatically turn on Elastic
monitoring at Pod deployment time.
@@ -164,6 +171,115 @@ After deploying this Pod, the data will start flowing in automatically. You can
NOTE: All assets (dashboards, ingest pipelines, and so on) related to the Redis integration are not installed. You need to explicitly <<install-uninstall-integration-assets,install them through {kib}>>.


[discrete]
=== Hints autodiscovery for Kubernetes log collection

Log collection for Kubernetes autodiscovered pods can be supported by using the https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone/templates.d/container_logs.yml[container_logs.yml template]. {agent} needs to emit a `container_logs` mapping in order to start collecting logs for all discovered containers, *even if no annotations are present on the containers*.

1. Follow the steps described above to enable hints autodiscovery.
2. Make sure that the relevant `container_logs.yml` template is mounted under the `/etc/elastic-agent/inputs.d/` folder of {agent}.
3. Deploy the {agent} manifest.
4. {agent} should now discover all containers inside the Kubernetes cluster and collect the available logs.

This default behaviour can be disabled by setting `hints.default_container_logs: false`, which disables automatic log collection from all discovered pods. Users then need to explicitly annotate their pods with the following annotations:

[source,yaml]
----
annotations:
co.elastic.hints/package: "container_logs"
----


[source,yaml]
----
providers.kubernetes:
node: ${NODE_NAME}
scope: node
hints:
enabled: true
default_container_logs: false
...
----

In the following sample nginx manifest, we additionally provide a specific stream annotation to configure the filestream input to read only the stderr stream:

[source,yaml]
----
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: nginx
name: nginx
namespace: default
spec:
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
annotations:
co.elastic.hints/package: "container_logs"
co.elastic.hints/stream: "stderr"
spec:
containers:
- image: nginx
name: nginx
...
----

Users can monitor the final rendered {agent} configuration:

[source,bash]
----
kubectl exec -ti -n kube-system elastic-agent-7fkzm -- bash


/usr/share/elastic-agent# /elastic-agent inspect -v --variables --variables-wait 2s

inputs:
- data_stream.namespace: default
id: hints-container-logs-3f69573a1af05c475857c1d0f98fc55aa01b5650f146d61e9653a966cd50bd9c-kubernetes-1780aca0-3741-4c8c-aced-b9776ba3fa81.nginx
name: filestream-generic
original_id: hints-container-logs-3f69573a1af05c475857c1d0f98fc55aa01b5650f146d61e9653a966cd50bd9c
[output truncated ....]
streams:
- data_stream:
dataset: kubernetes.container_logs
type: logs
exclude_files: []
exclude_lines: []
parsers:
- container:
format: auto
stream: stderr
paths:
- /var/log/containers/*3f69573a1af05c475857c1d0f98fc55aa01b5650f146d61e9653a966cd50bd9c.log
prospector:
scanner:
symlinks: true
tags: []
type: filestream
use_output: default
outputs:
default:
hosts:
- https://elasticsearch:9200
password: changeme
type: elasticsearch
username: elastic
providers:
kubernetes:
hints:
default_container_logs: false
enabled: true
node: control-plane
scope: node
----


[discrete]
== Troubleshooting
@@ -115,6 +115,8 @@ include::shared-env.asciidoc[tag=fleet-server-cert-key-passphrase]

include::shared-env.asciidoc[tag=fleet-server-insecure-http]

include::shared-env.asciidoc[tag=fleet-server-es-ca-trusted-fingerprint]

|===
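As a sketch (the certificate file name and image tag are assumptions, and `<fingerprint>` is a placeholder), computing the CA fingerprint and passing it to the container might look like:

[source,shell]
----
# Compute the SHA-256 fingerprint of the CA certificate, with colons removed:
openssl x509 -in elasticsearch-ca.crt -noout -fingerprint -sha256 \
  | cut -d'=' -f2 | tr -d ':'

# Pass the resulting value when starting the container:
docker run \
  -e FLEET_SERVER_ELASTICSEARCH_CA_TRUSTED_FINGERPRINT=<fingerprint> \
  docker.elastic.co/beats/elastic-agent:8.9.1
----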

[discrete]
@@ -287,6 +287,21 @@ Setting this to `true` is not recommended.

// =============================================================================

// tag::fleet-server-es-ca-trusted-fingerprint[]
|
[id="env-{type}-fleet-server-es-ca-trusted-fingerprint"]
`FLEET_SERVER_ELASTICSEARCH_CA_TRUSTED_FINGERPRINT`

| (string) The SHA-256 fingerprint (hash) of the certificate authority used to self-sign {es} certificates.
This fingerprint is used to verify self-signed certificates presented by {fleet-server} and any inputs started
by {agent} for communication. This flag is required when using self-signed certificates with {es}.

*Default:* `""`

// end::fleet-server-es-ca-trusted-fingerprint[]

// =============================================================================

// tag::fleet-enroll[]
|
[id="env-{type}-fleet-enroll"]
@@ -55,13 +55,13 @@ Refer to:
== Minimum Requirements

// lint ignore 2vcpu 1gb
Minimum requirements have been determined by running the {agent} on a GCP `e2-micro` instance (2vCPU/1GB).
The {agent} used the default policy, running the system integration and self-monitoring.

// lint ignore mem
|===
| **CPU** | Under 2% total, including all monitoring processes
| **Disk** | 1.7 GB
| **RSS Mem Size** | 400 MB
|===
Adding integrations will increase the memory used by the agent and its processes.
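For container deployments, these minimums could translate into a resource sketch like the following (the values are illustrative starting points derived from the table above, not official recommendations):

[source,yaml]
----
# Hypothetical container resources for the default policy.
resources:
  requests:
    memory: 400Mi   # observed RSS memory size
  limits:
    memory: 800Mi   # headroom for additional integrations
----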
@@ -47,7 +47,7 @@ Additionally, by default one agent is elected as **leader** (for more informatio

--
[role="screenshot"]
image::images/k8sscaling.png[{agent} as daemonset]
--

The above schema explains how {agent} collects and sends metrics to {es}. Because the leader {agent} is also responsible for collecting cluster-level metrics, it requires additional resources.
@@ -88,7 +88,7 @@ Sample Elastic Agent Configurations:
[discrete]
=== Proposed Agent Installations for large scale

Although daemonset installation is simple, it cannot accommodate the varying agent resource requirements depending on the collected metrics. The need for appropriate resource assignment at large scale requires more granular installation methods.

{agent} deployment is broken in groups as follows:

@@ -98,7 +98,7 @@

- kube-state-metrics shards and {agent}s in the StatefulSet defined in the kube-state-metrics autosharding manifest

Each of these groups of {agent}s will have its own policy specific to its function and can be resourced independently in the appropriate manifest to accommodate its specific resource requirements.

These resource-assignment needs lead us to the alternative installation methods described next.

@@ -124,7 +124,7 @@ Based on our https://github.com/elastic/elastic-agent/blob/ksmsharding/docs/elas

> The tests above were performed with {agent} version 8.8 + TSDB Enabled and scraping period of `10sec` (for the {k8s} integration). Those numbers are just indicators and should be validated per different {k8s} policy configuration, along with applications that the {k8s} cluster might include

NOTE: Tests have been run with up to 10K pods per cluster. Scaling to a bigger number of pods might require additional configuration on the {k8s} side and from cloud providers, but the basic idea of installing {agent} while horizontally scaling KSM remains the same.

[discrete]
[[agent-scheduling]]
@@ -137,15 +137,15 @@ Trying to prioritise the agent installation before the rest of the application microservices
[discrete]
=== {k8s} Policy Configuration

Policy configuration of the {k8s} package can heavily affect the amount of metrics collected and ultimately ingested. Factors to consider in order to make your collection and ingestion lighter:

- Scraping period of {k8s} endpoints
- Disabling log collection
- Keep audit logs disabled
- Disable events dataset
- Disable {k8s} control plane datasets in cloud-managed {k8s} instances (for more information see the <<running-on-gke-managed-by-fleet>>, <<running-on-eks-managed-by-fleet>>, and <<running-on-aks-managed-by-fleet>> pages)
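As an illustrative standalone-policy fragment (the exact stream structure is an assumption based on the {k8s} integration), a lighter collection setup might look like:

[source,yaml]
----
# Hypothetical stream fragment: scrape pod metrics less frequently,
# and leave audit logs and the events dataset out of the policy.
- data_stream:
    dataset: kubernetes.pod
    type: metrics
  period: 30s   # longer than the 10s scraping period used in the tests above
----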

User experience regarding dashboard responsiveness is also affected by the size of the data being requested. As dashboards can contain multiple visualisations, the general consideration is to split visualisations and group them according to frequency of access. Fewer visualisations tend to improve the user experience.

Additionally, the https://github.com/elastic/integrations/blob/main/docs/dashboard_guidelines.md[Dashboard Guidelines] are constantly updated to track the needs of observability at scale.

@@ -171,7 +171,7 @@ If {agent} is configured as managed, in {kib} you can observe under **Fleet>Agen

--
[role="screenshot"]
image::images/agent-status.png[{agent} Status]
--

Additionally, you can verify the process status with the following commands:
@@ -229,7 +229,7 @@ kubectl logs -n kube-system elastic-agent-qw6f4 | grep "kubernetes/metrics"

------------------------------------------------

You can verify the instant resource consumption by running the `kubectl top pod` command and identify whether agents are close to the limits you have specified in your manifest.

[source,bash]
------------------------------------------------
@@ -248,13 +248,13 @@ elastic-agent-wvmpj 27m 35
Filter for Pod dataset:
--
[role="screenshot"]
image::images/pod-latency.png[{k8s} Pod Metricset]
--

Filter for State_Pod dataset:
--
[role="screenshot"]
image::images/state-pod.png[{k8s} State Pod Metricset]
--

Identify how many events have been sent to {es}: