
suricata stops and starts

paul_psmithpaul_psmith Alien Embassador
I'm trying to get Suricata working on two standalone sensor systems (not DB, FW, or SRV). They are running 4.1.2, as is the main AV server. These are all the open source versions.

Suricata will run on sensor 1 for 5 minutes or so, then crash. I get events in the server when it is running so I know it is ok that way. I can't seem to find anything in the log files though to tell me what is happening.

Sensor 1 has two dual-core CPUs and 6 GB of RAM. It is running two interfaces: one is 1G copper Ethernet and the other is a bonded fiber pair from a network tap.

Sensor 2 has one dual-core CPU and 2 GB of RAM. It shows Suricata running, but none of the events are showing up in the server. I get OSSEC events from it first thing in the morning after some cron jobs run, or after a reboot. After a reboot I also get some Snort/Suricata events.

And yes, the sensor interfaces are separate from the management ones.

I've been fighting with these since the upgrade to 4.0. Snort was the problem child before, which is why I switched to Suricata. I managed to get Suricata working, which I could not do with Snort, but I'm still having some of these issues.

Any debugging help would be appreciated. I will also continue to work on this myself.



Thanks!!

Answers

  • marcmunkmarcmunk Abducted By Aliens
    I'll start off with a silly question. Do you see anything in the syslog on your sensor server? Or perhaps in Suricata's error log? Is Suricata even running when you don't get any messages?
  • paul_psmithpaul_psmith Alien Embassador
    edited January 2013
    This hits the syslog right when the process dies.

    Jan  8 11:06:07 soknse08 kernel: [357337.756466] Detect1[22472]: segfault at 7fc259a76058 ip 00007fc27434f12e sp 00007fc271e03840 error 6 in libc-2.11.3.so[7fc2742da000+159000]
    Jan  8 11:08:51 soknse08 kernel: [357501.723089] Detect1[22767]: segfault at 7f726ca721a8 ip 00007f7279ff012e sp 00007f7277aa4810 error 6 in libc-2.11.3.so[7f7279f7b000+159000]

    adding this from the suricata start log:

    8/1/2013 -- 11:08:04 - <Warning> - [ERRCODE: SC_WARN_OUTDATED_LIBHTP(207)] - libhtp < 0.2.7 detected. Keyword http_raw_header will not be able to inspect response headers.
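
For anyone reading the two kernel segfault lines above, here is a minimal Python sketch that pulls them apart. It subtracts the library's load base from the faulting instruction pointer to get an offset inside `libc-2.11.3.so`, and decodes the x86 page-fault `error` bits. The function names `parse_segfault` and `decode_error` are made up for illustration. The regex only assumes the log format shown in this thread.

```python
import re

# Matches the kernel's segfault report in the format quoted in this thread.
SEGFAULT_RE = re.compile(
    r"(?P<thread>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_error(err):
    """Decode the low three bits of the x86 page-fault error code."""
    return {
        "page_present": bool(err & 1),  # 0 means no page was mapped there
        "write": bool(err & 2),         # 1 means the access was a write
        "user_mode": bool(err & 4),     # 1 means the fault came from user space
    }


def parse_segfault(line):
    """Return thread, pid, library, offset into the library, and error flags."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "thread": m.group("thread"),
        "pid": int(m.group("pid")),
        "library": m.group("lib"),
        "offset": ip - base,  # position of the faulting instruction in the library
        "error": decode_error(int(m.group("err"))),
    }
```

Run against the two lines above, both crashes resolve to the same offset inside libc (0x7512e), and `error 6` decodes to a user-mode write to an unmapped page. That suggests, but does not prove, that the detect thread is hitting the same code path each time.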



  • marcmunkmarcmunk Abducted By Aliens
    It seems your Suricata isn't running when you aren't getting the data. In a way that's a good thing, because then we can focus on your sensor.

    How much can you test on your OSSIM server? Is it in production? Is it virtual, so you can clone it and change the IP and we have something to work on without screwing up your production server? Same for your sensor: virtual or not? I guess it's not in production with the errors you describe :)
  • paul_psmithpaul_psmith Alien Embassador
    My whole setup is essentially in production, as I use it to try to watch things on the network. But it is for my own use, as we do not have any requirements from the business units for this type of info. I can wipe and reload pretty much everything without any worries. I'm not concerned about loss of information.

    I own the systems so I can do whatever I want to them.

    They are not virtual.

  • marcmunkmarcmunk Abducted By Aliens
    Let's hold back on reinstalling anything just yet. If I understand you correctly, everything else works fine on your OSSIM machine, right? Are all updates applied on the sensor?
  • paul_psmithpaul_psmith Alien Embassador
    Thanks for the help. Yes. All updates are installed. But this is not the only problem. I have other forum posts about some of the other issues.

    I am also having problems with ntop. And when I try to delete large numbers of events from the database, they don't seem to really get deleted, as they still show in the SIEM. I get an error when I try to delete a large number, like all the events from a single IP or all events of a particular kind.

    I can delete smaller blocks of events, like under 200. But it still seems like not all of those events get deleted, unless it takes a really long time to do that process.

    I have about 146,000 events in the database right now. The main server has two 8-core CPUs, 32 GB of RAM, and 1.3 TB of RAID 5. It is a Dell server.
  • marcmunkmarcmunk Abducted By Aliens
    So you have problems with both your sensor and your OSSIM server. I would do a backup of the DB and other stuff and then do a reinstall from a different ISO than the one you have used. Did you do an MD5 hash check on the install ISO? Sorry I can't be of more help than that.
  • paul_psmithpaul_psmith Alien Embassador
    md5 matched on the iso. Is the new installer 4.1.2? I just downloaded it but not sure.

    Ah. Looks like it is since the md5 matches on both the one I have from before and this one.

    Like I said, I am not so worried about the data. I refresh all my Ossim installs every year or so as they seem to get mucked up as updates come out or because of other various problems.

    And since snort and suricata are not working consistently, all the data is pretty useless anyway.

    I'll try a full refresh and see what happens.

  • paul_psmithpaul_psmith Alien Embassador
    Well, something changed in the last few days and now ntop is working OK. Not sure what or why. I have not had the time to hand-hold the systems for the last few days, so I can't say for sure. I do know there were some broken cron files in the daily folder that were failing because they wanted to connect to a SQL server on localhost. Since these were on a sensor-only system, I just moved them out of the folder, and the email messages I was getting about the cron failures went away. Maybe there was something there. But that was only on one sensor.

    Sigh. Just one thing after another. Seems like the way it is.

    I'll see if maybe suricata is more stable.
  • paul_psmithpaul_psmith Alien Embassador
    OK. I did a full reinstall of a sensor on totally new hardware. Increased memory from 2 GB to 6 GB, went from 2 cores to 4 cores, faster disk, etc.

    Suricata would still stop and start, so I did some more digging and found some info about tuning Suricata.
    In suricata-debian.yaml I uncommented the max-pending-packets setting and changed it to 2048, and this seems to have helped. So far it has been running for some time now.
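
The max-pending-packets change described above can be scripted if it has to be repeated across several sensors. This is a minimal Python sketch, assuming the setting appears as a single top-level line (possibly commented out with a leading `#`), as it does in stock Suricata config files. The function name `set_max_pending_packets` is made up for illustration, and the sketch works on the file's text rather than touching any file itself.

```python
import re


def set_max_pending_packets(yaml_text, value):
    """Uncomment (if needed) and set the top-level max-pending-packets key."""
    # Only match a line that starts with the key, optionally behind a '#'.
    # Prose comments that merely mention the setting are left alone.
    pattern = re.compile(r"^#?[ \t]*max-pending-packets:.*$", re.MULTILINE)
    new_line = "max-pending-packets: {}".format(value)
    new_text, count = pattern.subn(new_line, yaml_text, count=1)
    if count == 0:
        # Key not present at all: append it instead of guessing a position.
        new_text = yaml_text.rstrip("\n") + "\n" + new_line + "\n"
    return new_text
```

To apply it, read the config file into a string, pass it through the function with 2048, write the result back, and restart Suricata. Keep a copy of the original file first.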

    Another thing I noticed in the yaml file is that it specifies running pfring on the main admin interface, eth0. Should I change this in the yaml file so it is on the sniffing interface? This also makes me wonder whether ntop has pfring running on the wrong interface too. Is there any way to verify and change it? I do get excessive amounts of dropped packets in ntop, even on a lower-traffic sensor.
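
One way to check which interfaces the Suricata config binds pfring to is to list them straight from the yaml text. This is a rough Python sketch, assuming the file has a top-level `pfring:` section containing `- interface: <name>` entries, as stock suricata.yaml files do. The helper name `pfring_interfaces` is made up for illustration. It is a simple line scanner, not a real YAML parser, and it says nothing about how ntop is configured.

```python
import re


def pfring_interfaces(yaml_text):
    """Return the interface names listed under the top-level pfring: section."""
    interfaces = []
    in_pfring = False
    for raw in yaml_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments and trailing spaces
        if not line.strip():
            continue
        if not line[0].isspace() and not line.startswith("-"):
            # An unindented key starts a new section; track whether it is pfring.
            in_pfring = line.startswith("pfring:")
            continue
        if in_pfring:
            m = re.match(r"\s*-?\s*interface:\s*(\S+)", line)
            if m:
                interfaces.append(m.group(1))
    return interfaces
```

If this returns only the management interface (eth0 in the post above) on a sensor that sniffs on a different port, that mismatch is worth fixing before tuning anything else.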

    This new sensor is averaging about 340Mbit/s.

    Thanks!
    P
  • paul_psmithpaul_psmith Alien Embassador
    Update as of today. Seems that suricata is now stable with the memory tuning settings in place. I was also getting a lot of errors from ntop and had to do some tuning there as well.

    However ntop is still reporting something like 1000% loss. I am pretty much giving up on ntop on this sensor as I think there is just way too much traffic and I think there is a very high loss from the span port and that is the real problem.

    But now I can't stop ntop from being restarted by the ossim-agent. I have unchecked it in the server GUI and also made sure it was disabled in ossim-setup, but it keeps restarting. I have renamed the executable to make it fail, but that seems like a not-so-great idea.

    Thx
  • paul_psmithpaul_psmith Alien Embassador
    On a couple of fresh installs, on newer hardware, Suricata seems to run OK. It can be stopped and restarted fine. On an older sensor that has been upgraded from 4.0 to 4.1, any time I run ossim-reconfig or stop and start Suricata, it seems to go through a long cycle of stopping and starting. Eventually it manages to get started and stay started, but that takes quite a few attempts. I don't have the time to watch it, but it does this for a while and then a few hours later it is OK.

This discussion has been closed.