Last week, one of our competitors hosted a debate about the SOC's future between Anton Chuvakin, Security Strategist at Google, and Carson Zimmerman, Security Engineer at Microsoft. Since great content and vision are universal, we decided to summarize it.
The debate was divided into five themes, and the speakers made lots of insightful comments on each of them.
Carson Zimmerman (CZ): Security Engineer at Microsoft, author of Ten Strategies of a World-Class Cybersecurity Operations Center
Anton Chuvakin (AC): Security Strategist at Google, formerly at Gartner, co-author of Security Warrior: Know Your Enemy
Nimmy Reichenberg (NR): CMO at Siemplify
Main takeaways to understand the debate
1) SIEM is dead
As a function, SIEM (Security Information and Event Management) won't die: there will always be a need for telemetry. What matters is the integration between all the technologies analysts use and a coherent architecture. Suppressing the SIEM isn't conceivable, but the number of panes of glass needs to be reduced.
2) SOC is dead
People need to stop thinking of SOCs (Security Operations Centers) as tiers or teams confined to a specific room. Collaboration is needed across the whole SOC: a stronger feedback loop, and more cohesion and coherence between every member. Walls need to be torn down.
3) Humans are dead
New tools such as ML/AI are attractive. Alas, they don't offer the same level of transparency and trust as older tools (like Snort). To achieve the same level of confidence, analysts should be able to learn the underlying techniques.
About automation in general, the aim is to grow the ratio of terabytes of data managed per human.
There are some cases where automation is a must: worms, for example. People need a faster-than-human reaction.
There are two buckets: AI/ML and human. People have to determine which problem falls into which bucket. That is the right investment and the key success factor to make ML/AI work. SOAR, ML, SIEM, and big-data platform vendors often miss that point.
4) Tier 1 is dead
The notion of tiers is being challenged. Some organizations rotate analysts between missions. Others have implemented an SRE logic in their SOCs (detection engineers develop solutions to solve detection challenges at scale; they also manage the alerts their solutions create).
What about fully automating tier 1? Tier 1 tasks are tedious but valuable for training junior analysts. The aim is ultimately to make every analyst's life more manageable while maintaining ways to attract, recruit, and train juniors. Companies need to grow their workforce and show career progression; otherwise, they'll end up with tier 2 and 3 analysts and no tier 1 juniors.
5) EDR killed NDR
The pendulum is swinging in favor of EDR. Alas, in the long term, both technologies could be pushed into niches because of the particularities of the cloud. At some point, there will no longer be the network choke points that made NDR traffic monitoring practical in traditional networks, nor will EDR have a machine to put an agent on. Will there be a cloud-native feature that enables analysts to imitate netflow telemetry?
Now that you have all the keys, let's dive into the debate!
How is the SOC evolving?
SOCs as center of expertise
As a center of expertise, there will always be a SOC. The constituency can differ from one company to another – back-office or remote, for example – but the need for a team remains.
Physical SOCs under pressure
On top of that, the pandemic affected SOCs. People who were attached to the SOC as a physical space endured stress during the pandemic. In contrast, others, who considered the SOC a team not necessarily working in a physical room, were more agile in adapting to the situation. Yes, it's about adaptation.
The main point is collaboration and cohesion across the team. That is where physical SOCs endured stress when forced to work remotely.
Plus, there is a concern about the very structure of the SOC today. Can we still structure SOCs according to a tiered scale?
What is the impact of cloud migration on security tools?
A) Are SIEMs challenged as a function or a technology?
SIEM is at the heart of the SOC, and it has been around for 25 years. But do SOCs still need a SIEM today? Can they replace it? What should the SIEM be used for?
A SIEM is a set of technologies and capabilities brought together to collect security-relevant telemetry; persist it to support detection and correlation on that telemetry; provide a rich analytic framework, including both off-the-shelf detections and the ability to modify them or create new ones; and support everything around the alerts that come out: filtering, enrichment, down-selection, de-duplication, and finally the analyst's ability to take a position and escalate the alert.
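As a rough illustration of that alert-handling function, here is a minimal sketch of a post-collection alert pipeline (de-duplication, enrichment, down-selection). The alert fields, asset database, and severity threshold are all hypothetical, not any particular product's schema:

```python
from collections import defaultdict

# Hypothetical post-SIEM alert stages: de-duplicate, enrich with asset
# context, then down-select so only alerts worth an analyst's time survive.

def deduplicate(alerts):
    """Collapse alerts sharing the same (rule, host) into one, with a count."""
    buckets = defaultdict(list)
    for a in alerts:
        buckets[(a["rule"], a["host"])].append(a)
    return [dict(group[0], count=len(group)) for group in buckets.values()]

def enrich(alert, asset_db):
    """Attach asset criticality from an inventory lookup (made-up schema)."""
    alert["criticality"] = asset_db.get(alert["host"], "unknown")
    return alert

def triage(alerts, asset_db, min_severity=3):
    deduped = deduplicate(alerts)
    enriched = [enrich(a, asset_db) for a in deduped]
    # Down-select: drop low-severity noise before it reaches a human.
    return [a for a in enriched if a["severity"] >= min_severity]

alerts = [
    {"rule": "brute-force", "host": "web01", "severity": 4},
    {"rule": "brute-force", "host": "web01", "severity": 4},
    {"rule": "port-scan", "host": "dev02", "severity": 2},
]
queue = triage(alerts, asset_db={"web01": "high"})
print(queue)  # one brute-force alert on web01, count=2, criticality=high
```

The point of the sketch is only the shape of the function: each stage reduces or contextualizes the alert stream before escalation.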
Are SIEMs going to be replaced?
The question is not the SIEM itself, but what set of technologies SOCs will use to accomplish the outcome described above as the function of the SIEM. Twenty years ago, for many SOCs, the only option was to buy a SIEM product, implement it, and leverage it. Today, however, there are more options.
In the end, a SIEM is a set of technologies gathered to analyze data, and this function cannot die.
But, of course, every particular technology can die. Today, XDR is presented as a popular potential SIEM killer. But the chances of SIEM eating XDR are high.
SIEMs as a way to build a coherent architecture
One of the original values of a SIEM was to reduce the panes of glass for the analyst, if well integrated with the other tools native to the SOC (although the single pane of glass promised by vendors pretty much never existed!). Multiple combinations exist: SIEM + ML, SIEM + EDR, and so on. The point is that a coherent architecture is needed. Could analysts compose an architecture with tools that do not come from the SIEM vendor? It's possible; SOARs could be part of it.
Are on-premise SIEMs going to die?
One of the significant trends of the SIEM market today is cloud migration. Most new SIEM implementations are on the cloud. Is on-premise dead or dying? Is there still a practical use to deploy a SIEM on-premise?
What do people mean when they talk about cloud SIEM? It's a sliding scale. Before the name "cloud" was used, technologies already existed that amounted to someone else's computer, with some ways to integrate data from other computers. Today it is called a cloud. The question is: how are these techniques blended together?
In hybrid schemes, you don't have control over all the tools, so this is not truly a continuous scale. Some people want appliances next to their data servers, and many still want physical servers and their data on-premise.
In the long term? On-premise may die, although there will be discontinuities. Sovereignty, for example: sometimes people in Europe don't want to use the cloud of a company not based in Europe. Moreover, sometimes analysts need to build a data set that can't be sent to the cloud.
All in all, on-premise SIEM will maybe die (or be relegated to a niche technology) when the cloud delivers data analytics on a whole other scale: analytics advantages, shared data analysis, threat-intelligence advantages. On-premise SIEM will be too far behind; it would be like choosing between a spaceship and bikes and horses.
B) Between NDR and EDR: Who wins?
The current balance is in favor of EDR
Priorities have changed. In 2013-14, people would fight not to deploy agents. It was like a tsunami hitting their systems: crashes, blue screens, kernel panics. Even though people hated the endpoint approach in the first years, it ultimately won.
The rebalancing toward EDR and the endpoint approach is pushing the network into an auxiliary role. It did not kill NDR, but the pendulum is still swinging.
Ultimately, between EDR and NDR, log analytics would win. Why do we still run network sensors? Are there significant choke points where network sensors make sense? As we move to cloud services, all analysts have are logs and no network to put sensors on. Hosts are not managed by the constituencies that the SOC serves.
Ultimately, EDR and NDR should both fear the migration to the cloud
It won't be a surprise if, in the medium term, EDR and NDR are both pushed into a niche. With SaaS growing, the traditional server and VM base will decline, and EDR will soon have no machine to put an agent on.
In the end, both of these technologies should fear log observability, micro-services, and containers.
The question is: how many fewer servers are companies running themselves, on-premise or in IaaS? Companies need a strategy for an EDR being turned off or an NDR being circumvented. In a composite sensing scenario, what happens when a high-tier adversary bypasses it? What is the fallback?
One trend that brought EDR to its current stature is the cloud. There are no longer those network choke points, ideal for NDR traffic monitoring, that existed in former networks. How will companies monitor their networks in a cloud environment where you can't place inline sensors?
Which of these has a higher chance of surviving in the cloud? For now, companies are copying their on-premise mentality into the cloud. As we move forward in the cloud, both of these technologies will be surpassed by in-app telemetry.
Cloud providers let companies bolt on some network-sensing capability at the network layer fairly easily. The point is: is there a cloud-native feature that enables analysts to imitate netflow telemetry? And how easy is the integration with a cloud-based appliance?
Cost-wise, on-premise SOCs could buy and refresh network sensors. In the cloud, are there easy mechanisms to attribute the cost of a virtual network appliance to the SOC? What will those costs be? If analysts have to sniff high-volume traffic in the cloud, it will be expensive; there is a problem of economic scale in this case.
SOC sustainability and automation
SOC structure needs to change…
If people think of the SOC and security operations as a function, how should it be structured?
Of course, traditionally, tier 1 meant alert triage. Today? That's not as absolute. Alert triage and initial investigation are not tier 1 anymore. What proportion of the whole SOC do those "tier 1" analysts represent today? Can people still call them tier 1, or does that terminology need to change? Some organizations rotate people through those functions; others have implemented an SRE logic in their SOCs, meaning detection engineers develop solutions to solve detection challenges at scale and also manage the alerts their solutions create.
One primary concern is how to support young engineers. How do we make them productive team members without jeopardizing more advanced functions? Historically, that was tier 1's role. But if there's no permanent tier 1 anymore, how do companies accomplish that support?
This is the concern with the push to automate tier 1. Analysts could create playbooks so that initial triage is done by machines. Conceptually it works: it would give every member of the SOC the opportunity to make their lives better through better alert de-duplication, enrichment, funneling, filtering, ML, and so on. It would decrease the number of people doing alert triage.
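To make the playbook idea concrete, here is a hedged sketch of a machine-run tier-1 triage playbook. The step names, alert fields, and verdicts are invented for illustration; they are not drawn from any real SOAR product:

```python
# Illustrative tier-1 playbook: each step inspects an alert and either
# returns a verdict ("close", "escalate", "queue") or None to pass it on.

def known_benign(alert):
    # e.g. vulnerability-scanner traffic from an allow-listed source (made up)
    return "close" if alert["src"] in {"10.0.0.5"} else None

def repeat_offender(alert):
    # Hosts with a history of incidents go straight to a human.
    return "escalate" if alert.get("prior_incidents", 0) >= 3 else None

def severity_gate(alert):
    # Final gate: high severity escalates, everything else waits in the queue.
    return "escalate" if alert["severity"] >= 8 else "queue"

PLAYBOOK = [known_benign, repeat_offender, severity_gate]

def run_playbook(alert):
    """Return the first verdict a playbook step produces."""
    for step in PLAYBOOK:
        verdict = step(alert)
        if verdict:
            return verdict
    return "queue"  # default: leave it for a human

print(run_playbook({"src": "10.0.0.5", "severity": 9}))      # close
print(run_playbook({"src": "198.51.100.7", "severity": 9}))  # escalate
print(run_playbook({"src": "198.51.100.7", "severity": 2}))  # queue
```

The design choice matches the debate's point: the machine handles the repetitive first pass, and only alerts that survive the playbook consume human attention.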
…to attract new talent and young engineers
Where do companies get senior people? What about career progression? How do you get to tier 2? Where do junior people come from? Where do you get talent if you’re not able to grow it?
Companies need to change the model because they can’t funnel enough people, and they need to use more inclusive language: making everyone in the SOC feel valued.
On that particular point: with the pandemic, companies started hiring people from anywhere, but they also found that people are less accepting of tedious jobs. If you give tier 1 analysts repetitive tasks and a monotonous routine, turnover will be even higher. Companies have to build careers that are appealing to young people, and they have to figure out how to hire junior people, give them experience, keep them interested, and make them want to grow and develop.
Ultimately, it goes back to the first point of the debate: team cohesion. How do people structure a SOC? There is a need to build a coherent team with a line of sight and a feedback loop. The last thing a SOC wants is to build walls between its different components.
Are robots going to replace humans, or enhance them?
Can automation entirely replace human roles in the SOC?
In the future, will cybersecurity be about our robots fighting their robots? What is the role of humans today?
There's an analogy here with IDS: why did so many people use Snort for so long? If analysts ran it properly, they got the alert, the packet, and the signature. That gave the analyst a great deal of transparency and trust.
When we talk about ML/AI, we need to think about what we are doing to support the same level of transparency and trust in the telemetry coming out of it. One solution is for analysts to learn the techniques themselves.
Are humans dead? The question is more about the ratio between humans and terabytes: more terabytes, fewer humans.
Today, there is more data, more threats, and more complicated environments. The things that need to be secured grow far faster than the number of humans. Full automation is impossible in the short term because of cognitive abilities of humans that are difficult for a machine to imitate, but the robot-to-human ratio needs to grow. It doesn't mean that humans need to be replaced, but that one human should be able to manage more terabytes.
But there are some scenarios that require a response faster than human time. Some threats can only be dealt with by machines: worms, for example. The reaction needs to be immediate; humans can't go hunt worms by hand.
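As a toy illustration of that faster-than-human response, the sketch below flags worm-like fan-out (a burst of connections to many distinct internal hosts in a short window) and "isolates" the host automatically, with no human in the loop. The thresholds and the isolate() hook are assumptions, not a real EDR API:

```python
import time
from collections import deque

WINDOW_SECONDS = 10        # sliding observation window (made-up threshold)
MAX_DISTINCT_TARGETS = 20  # fan-out beyond this looks worm-like (made up)

class WormGuard:
    def __init__(self):
        self.events = deque()  # (timestamp, target) pairs within the window

    def isolate(self, host):
        # In a real deployment this would call an EDR or firewall API.
        print(f"ISOLATING {host}")

    def observe(self, host, target, now=None):
        """Record one outbound connection; isolate on excessive fan-out."""
        now = now if now is not None else time.time()
        self.events.append((now, target))
        # Drop events that fell out of the sliding window.
        while self.events and now - self.events[0][0] > WINDOW_SECONDS:
            self.events.popleft()
        targets = {t for _, t in self.events}
        if len(targets) > MAX_DISTINCT_TARGETS:
            self.isolate(host)
            return True
        return False

guard = WormGuard()
# Simulate a host suddenly fanning out to 25 distinct peers within a second.
fired = [guard.observe("ws42", f"10.0.1.{i}", now=1000.0 + i * 0.04)
         for i in range(25)]
print(any(fired))  # True: containment triggered without human intervention
```

The response fires within one observation of crossing the threshold, which is exactly the property humans cannot provide against a fast-spreading worm.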
What will be the role of humans in tomorrow's SOC?
Moreover, there are problems humans are good at solving themselves, and others they solve with the help of machines. Determining which falls into which bucket is the right investment and the key success factor to make ML/AI work. This is also why so many SIEM vendors lost contracts and bids when they were the incumbents. This is what SOAR, ML, SIEM, and big-data platform vendors need to bear in mind when they promise to enable their customers to achieve the successes shown in the demos at sales time.