Logstash: parsing a JSON string and removing nested JSON fields

Stella981

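1. Scenario: using a simple file of JSON strings as an example, this article describes how to parse nested JSON with Logstash and remove some of its fields.

On Linux, the file test.json contains the following: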

{"timestamp":"2018-08-02T14:42:50.084467+0800","flow_id":364959073190719,"in_iface":"eth1","event_type":"alert","src_ip":"10.0.0.4","src_port":80,"dest_ip":"10.0.0.5","dest_port":16781,"proto":"TCP","tx_id":0,"alert":{"action":"allowed","gid":1,"signature_id":2101201,"rev":10,"signature":"GPL WEB_SERVER 403 Forbidden","category":"Attempted Information Leak","severity":2},"http":{"hostname":"bapi.yahoo.com","url":"\/v1tns\/searchorderlist?_time=1533192163978","http_user_agent":"Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/67.0.3396.99 Safari\/537.36","xff":"39.106.108.38","http_content_type":"text\/html","http_method":"POST","protocol":"HTTP\/1.0","status":403,"length":568},"app_proto":"http","flow":{"pkts_toserver":5,"pkts_toclient":5,"bytes_toserver":1547,"bytes_toclient":1076,"start":"2018-08-02T14:42:50.082751+0800"}}

For easier reading, here is the same JSON after formatting:

{  
   "timestamp":"2018-08-02T14:42:50.084467+0800",
   "flow_id":364959073190719,
   "in_iface":"eth1",
   "event_type":"alert",
   "src_ip":"10.0.0.4",
   "src_port":80,
   "dest_ip":"10.0.0.5",
   "dest_port":16781,
   "proto":"TCP",
   "tx_id":0,
   "alert":{  
      "action":"allowed",
      "gid":1,
      "signature_id":2101201,
      "rev":10,
      "signature":"GPL WEB_SERVER 403 Forbidden",
      "category":"Attempted Information Leak",
      "severity":2
   },
   "http":{  
      "hostname":"bapi.yahoo.com",
      "url":"\/v1tns\/searchorderlist?_time=1533192163978",
      "http_user_agent":"Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/67.0.3396.99 Safari\/537.36",
      "xff":"39.106.108.38",
      "http_content_type":"text\/html",
      "http_method":"POST",
      "protocol":"HTTP\/1.0",
      "status":403,
      "length":568
   },
   "app_proto":"http",
   "flow":{  
      "pkts_toserver":5,
      "pkts_toclient":5,
      "bytes_toserver":1547,
      "bytes_toclient":1076,
      "start":"2018-08-02T14:42:50.082751+0800"
   }
}
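
The formatted view above can be reproduced in several ways. As a minimal sketch, assuming Python is available on the machine, the standard json module can re-indent a one-line record. The sample line is shortened for illustration, and pretty is a hypothetical helper name, not part of Logstash.

```python
import json

# One-line JSON as it would appear in the file (shortened here for illustration).
raw_line = '{"src_ip":"10.0.0.4","http":{"hostname":"bapi.yahoo.com","status":403}}'


def pretty(line: str) -> str:
    """Parse a one-line JSON string and return it indented for reading."""
    return json.dumps(json.loads(line), indent=3)


if __name__ == "__main__":
    print(pretty(raw_line))
```

Re-indenting changes only the whitespace; the parsed content is identical, so Logstash still has to read the original one-record-per-line file.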

2. Goal: parse this JSON and remove two fields from it: **the top-level "src_ip" field and the "hostname" field nested under "http"**.

My configuration file is as follows:

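input {
    file {
        path => "/usr/share/logstash/private.cond/test.json"  # the sample file shown above
        codec => "json"                # parse each line as JSON as it is read
        start_position => "beginning"  # read the file from the start
        sincedb_path => "/dev/null"    # do not persist the read position, so each run re-reads the file
    }
}
filter {
    json {
        source => "message"  # only needed without the json codec; here the codec has already parsed the line
    }
    mutate {
        remove_field => ["src_ip", "[http][hostname]"]  # a top-level field, then a nested field
    }
}
output {
    stdout {
        codec => rubydebug  # pretty-print each event to the console
    }
}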

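Note how remove_field is written on line 14 of the configuration: a top-level field is referenced by its plain name ("src_ip"), while a nested field is referenced with one pair of square brackets per level ("[http][hostname]").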
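
To make the bracket notation concrete, here is a small Python sketch of what such a reference means on a parsed event: each bracket pair names one level of nesting, and the last name is the key that is deleted. This only illustrates the path semantics and is not how Logstash implements it; parse_field_reference and remove_field are hypothetical helper names, and the event is a shortened version of the sample record.

```python
import json
import re
from typing import List


def parse_field_reference(ref: str) -> List[str]:
    """Split "[http][hostname]" into ["http", "hostname"]; a bare name is one top-level key."""
    parts = re.findall(r"\[([^\[\]]+)\]", ref)
    return parts if parts else [ref]


def remove_field(event: dict, ref: str) -> None:
    """Delete the referenced field in place; do nothing if any part of the path is missing."""
    *parents, leaf = parse_field_reference(ref)
    node = event
    for key in parents:
        node = node.get(key)
        if not isinstance(node, dict):
            return  # the path does not exist, so there is nothing to remove
    node.pop(leaf, None)


if __name__ == "__main__":
    event = json.loads(
        '{"src_ip":"10.0.0.4","dest_ip":"10.0.0.5",'
        '"http":{"hostname":"bapi.yahoo.com","status":403}}'
    )
    for ref in ["src_ip", "[http][hostname]"]:
        remove_field(event, ref)
    print(event)
```

In this sketch the sibling keys under "http" are left untouched; only the named leaf is removed, which matches what the Logstash output in this article shows.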

Running Logstash produces the following output:

{
         "alert" => {
                 "gid" => 1,
                 "rev" => 10,
            "severity" => 2,
           "signature" => "GPL WEB_SERVER 403 Forbidden",
              "action" => "allowed",
        "signature_id" => 2101201,
            "category" => "Attempted Information Leak"
    },
          "http" => {
                 "protocol" => "HTTP/1.0",
        "http_content_type" => "text/html",
          "http_user_agent" => "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36",
              "http_method" => "POST",
                   "length" => 568,
                      "url" => "/v1tns/searchorderlist?_time=1533192163978",
                      "xff" => "39.106.108.38",
                   "status" => 403
    },
          "path" => "/usr/share/logstash/private.cond/test.json",
    "event_type" => "alert",
      "src_port" => 80,
     "dest_port" => 16781,
       "dest_ip" => "10.0.0.5",
         "proto" => "TCP",
       "flow_id" => 364959073190719,
         "tx_id" => 0,
      "@version" => "1",
      "in_iface" => "eth1",
     "timestamp" => "2018-08-02T14:42:50.084467+0800",
          "flow" => {
         "pkts_toserver" => 5,
         "pkts_toclient" => 5,
        "bytes_toserver" => 1547,
                 "start" => "2018-08-02T14:42:50.082751+0800",
        "bytes_toclient" => 1076
    },
          "host" => "elk",
     "app_proto" => "http",
    "@timestamp" => 2018-08-02T10:14:14.372Z
}

As the output shows, src_ip and the hostname field under http have been removed successfully.
