Problem running new DVSwitch on same server as XLX, YSF, P25 or NXDN reflectors


David Young
 

I am having problems running the new DVSwitch modules on the same server as these reflectors.
XLX problem:  using the buster repository and DVSwitch, XLX cannot open its key UDP ports because DVSwitch seems to be using them.  After I uninstalled DVSwitch, XLX could open its required ports.
YSF Reflector problem:  again using the buster repository and DVSwitch modules, the reflector shuts down at random.  Sometimes it runs for hours, sometimes for a day or two between failures.  The YSFReflector log shows the error message:  free(): double free detected in tcache 2.
P25 and NXDN Reflectors:  same as the YSFReflector, except that I do not see that error message.  The reflectors shut down, and I have to manually perform a reload and then a restart using the reflector run scripts provided by G4KLX on his GitHub.
On the P25 and NXDN reflectors I tried DVSwitch-Server, then backed that out, and am currently using just DVSwitch without the dashboard.
Ideas???
--
Dave WB6DTB


Komkit Listisard
 

Dave, I am following this.  I run into the same thing: YSFReflector throws "free(): double free detected in tcache 2" very often.  I have no idea what the problem is.

I opened an issue on G4KLX's YSFClients GitHub repository, as I had no idea where else to ask what this error is about. If you add your report to that issue, it might have a better chance of getting some attention.  At the least I hope to learn what the error means.

I do not know whether it is DVSwitch related or not.  I have 2 YSF reflectors running: one alongside XLX, and the other standalone, just YSFReflector with no DVSwitch. I recall seeing the same error on both reflectors.

For the XLX server I am using only the MMDVM_Bridge service, just to bridge DMR <-> YSF; I disabled everything else.  If I do not disable the DVSwitch parts I do not use, they fight over the ports.

73, Kit


David Young
 

Hi Kit,
My guess is that it is related to the new DVSwitch.  I am also running both a YSF Reflector and an XLX reflector on the same server.  XLX would not run because DVSwitch was opening some of the ports that XLX needed.  I uninstalled DVSwitch, which still allows the use of MMDVM_Bridge to bridge both YSFReflector and XLXReflector to a master hub, in my case hblink3.  The buster repository, though, is still installed on the server.  In the past I ran 3 different YSFReflectors without that error ever occurring, all coexisting with the older stretch version of DVSwitch.  I think it is related to DVSwitch because I have also been having problems with P25 and NXDN Reflectors coexisting with the new DVSwitch buster version.  One time I did see that same double free detected error when my P25 Reflector shut down.
I would like to go back to the stretch version of DVSwitch, but that link is no longer available.  I was hoping Steve would bring back that link so we could test whether the error message also occurs with the older DVSwitch stretch version.  If so, then we would feel better about posting to G4KLX concerning this problem.

--
Dave WB6DTB


Steve N4IRS
 

Dave,
If there is an interaction with XLX, it is between XLX and ircDDBGateway. XLX and ircDDBGateway will not run on the same host. I have multiple P25Reflectors (3), multiple NXDNReflectors (2) and multiple YSFReflectors (2) running alongside NXDNGateway, P25Gateway, YSFGateway, MMDVM_Bridge, Analog_Bridge and Analog_Reflector. DVSwitch is the name of the overall project built out of those components. Most of the problems people have are in the port numbers. There is no significant difference between the "Stretch version" and the "Buster version". If you are having problems, I would like to see the logs where the issues show up.
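
Since port clashes come up so often, one quick way to see which process holds a given UDP port is to filter the output of ss. The helper below is only a sketch: udp_port_owners is a made-up name, and 30001, 20001 and 30051 in the usage comment are example D-Star ports, an assumption to check against your own XLX and ircDDBGateway configuration.

```shell
# Read "ss -lunp" output on stdin and print "<port> <owning process>" for
# every listed UDP port that matches one of the arguments.
# udp_port_owners is a hypothetical helper name, not part of DVSwitch or XLX.
udp_port_owners() {
    awk -v ports="$*" '
        BEGIN {
            # Turn the space-separated argument list into a lookup table.
            n = split(ports, p, " ")
            for (i = 1; i <= n; i++) want[p[i]] = 1
        }
        {
            # The local address is the first field that ends in ":<digits>".
            port = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /:[0-9]+$/) { port = $i; sub(/.*:/, "", port); break }
            }
            # With -p and root privileges the last field names the process.
            if (port in want) print port, $NF
        }'
}

# Example use (port numbers are assumptions, substitute your own):
#   sudo ss -lunp | udp_port_owners 30001 20001 30051
```

Run it once with the DVSwitch services started and once with them stopped; whichever process shows up next to a port XLX needs is the one to disable or move to another port.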

Steve N4IRS    



David Young
 

Steve,
OK, let me start with the YSF Reflector error.  Attached is the reflector log file from 3/27/21; the error message appears approximately 4 full screens from the top of the file.  I have 3 servers running XLX and YSF, with bridges from them to our hblink3 hub.  All 3 YSF Reflectors fail with that double free error at some point after startup.  Each reflector may run for an hour or more, or for 1 to 3 days, but at some point the error occurs and the reflector shuts down.  I am using the latest YSFReflector code from GitHub.
The P25 and NXDN reflector failures are somewhat similar: the reflectors start up, run, and bridge to the master hub.  Again we are using the latest versions of the reflector code from GitHub (G4KLX).  I don't have any log files that show them stopping, but at about the same frequency as the YSF reflector failures they stop responding.  I know that the gateways are still working, because restarting a gateway does not bring the connection to the master back.  MMDVM_Bridge is still running as well.  To get the reflector bridging again, I need to run P25Reflector.sh with 'reload', which stops the reflector, and then do a start.
Sometimes when the NXDN reflector stops, if I just run the script command "NXDNReflector.sh start /etc/NXDNReflector.ini &", I get a message that the reflector is already running, yet no connection can be made to it.  So I do the reload and then the start command and all is well again.  The gateway and MMDVM_Bridge therefore do not appear to be the problem, only the reflectors.  Again, we are running three sets of P25 and NXDN reflectors with bridging on three individual servers.  The reflector logs do not show any problem or error message.
I have tried both the gateway code installed by DVSwitch and the gateway code from GitHub, which is a later version by date, but neither makes any difference.  I have also tried different versions of Analog_Bridge for P25 and different versions of MMDVM_Bridge for both P25 and NXDN but, as above, I don't think these affect the problem.  I can send my .ini files if you want to see them.
Sometimes, instead of stopping, either the P25 or the NXDN bridge develops an extremely long latency; listening on the DMR side of the bridge, it sounds like a 78 speed record running at 33.3 speed.  Again, a P25 or NXDN script reload followed by a start makes this problem disappear until some time later.
All this has happened since we decided to move all our reflectors and bridges to new, cheaper cloud servers.  We did not have these problems on the original servers, which were using the older DVSwitch modules.  Now we are using the DVSwitch-Server modules (I am not saying that DVSwitch is the problem, just trying to figure out what has changed).  We are using the same server provider, just downsizing from 4 cores to 2 cores with less hard drive space.
I have also noticed from the server stats that when I use Analog_Bridge version 1.6.1 as the second Analog_Bridge needed for P25 to DMR (creating a separate executable and .ini file), my CPU usage climbs above 50%.  Replacing the second Analog_Bridge executable with version 1.4.2 avoids the high CPU usage.  I also needed to turn off Analog_Bridge logging: I was getting continuous messages in the Analog_Bridge log file reporting the successful running of something (I can't remember what), and they were filling up the allotted memory if I did not stop the logging.
Sorry for the long book.
--
Dave WB6DTB


Steve N4IRS
 

General comments:
The gateway code installed is directly from the G4KLX GitHub repositories. The only difference between the "old stretch" and the "new buster" repositories is the words stretch and buster. Each of the programs has been brought forward.

I see this in the YSFReflector log:
M: 2021-03-28 01:39:02.130 Removing WB6DTB     (80.191.195.172:4260) disappeared
free(): double free detected in tcache 2
Lots of hits on this. For example <https://stackoverflow.com/questions/57616404/what-does-it-mean-double-free-detected-in-tcache-2-while-using-mpz>
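
To pull every occurrence of this crash out of a log together with the line logged just before it, something like the following could be used. It is a sketch only: crash_context is a hypothetical helper name, and the log path in the usage comment is an assumption, so point it at wherever your reflector actually writes its log.

```shell
# Print each "double free detected" line along with the single log line that
# precedes it, so the last thing the reflector did before aborting is visible.
# crash_context is a hypothetical helper name; it accepts one or more log files.
crash_context() {
    # -h: no file name prefix, -B1: one line of leading context,
    # -F: treat the pattern as a fixed string rather than a regex.
    grep -h -B1 -F 'double free detected' "$@"
}

# Example use (path is an assumption, adjust to your installation):
#   crash_context /var/log/YSFReflector/YSFReflector-*.log
```

In the samples posted in this thread the preceding line is a "Removing ..." entry each time, which is the kind of pattern worth including when reporting the problem upstream.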

I would like to see a sample of this:
Also needed to turn off Analog_Bridge logging as I was getting continuous messages in the Analog_Bridge log file stating successful running of something (can't remember), but was filling up the allotted memory if I did not stop the logging.

I'll have to test this:
Also have noticed by looking at the server stats that when using Analog_Bridge version 1.6.1 that if I use it as the second Analog_Bridge needed for P25 to DMR (creating a separate executable and .ini file that my CPU stats go way up over 50%, so switching the second Analog_Bridge executable with version 1.4.2 does not cause the high CPU usage.

Steve N4IRS



Komkit Listisard
 

I have a similar error as well.

One question though: I am just bridging XLX (DMR) <-> YSF, so I only need to install MMDVM_Bridge, correct?

73, Kit


M: 2021-03-28 02:38:09.323 Removing ▒▒i▒RV (144.121.98.161:32780) unlinked

free(): double free detected in tcache 2
I: 2021-03-28 02:52:28.148 Opening YS


Steve N4IRS
 

Correct



David Young
 

Steve,
OK, I put one server running P25 and NXDN, with bridging to the hblink3 master hub, back to the stock versions of Analog_Bridge V161 and MMDVM_Bridge V163.  So far, after turning Analog_Bridge logging on at level 2, I am not getting the redundant messages.  This could have been my mistake when initially setting up the bridge: I forgot that the one instance of Analog_Bridge auto starts, and I probably started it a second time.  Or maybe one of the updates fixed the problem.
The high CPU usage usually takes almost an hour before it starts to jump up from 5%.  So you might let it run overnight.
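
Because the climb takes about an hour to appear, logging a CPU sample every minute is easier than watching by hand. The following is a rough sketch: over_threshold is a made-up helper name, and the 50 percent limit and 60 second interval in the usage comment are arbitrary choices. Note that ps reports CPU averaged over the life of the process, so a late climb shows up gradually rather than as a sudden jump.

```shell
# Read "pcpu pid comm" lines (as printed by: ps -eo pcpu=,pid=,comm=) on stdin
# and print only the processes at or above the given CPU percentage.
# over_threshold is a hypothetical helper name; the default limit is 50.
over_threshold() {
    awk -v limit="${1:-50}" '
        # Adding 0 forces a numeric comparison on both sides.
        $1 + 0 >= limit + 0 { printf "%s%% pid=%s %s\n", $1, $2, $3 }'
}

# Example use: append one timestamped sample per minute to a log file.
#   while sleep 60; do
#       date
#       ps -eo pcpu=,pid=,comm= | over_threshold 50
#   done >> cpu-samples.log
```

Left running overnight, the log shows when a process first crosses the limit and which executable it was.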
Here is an error message I get when starting the MMDVM_Bridge for NXDN, which is the second bridge, so it does not auto start and I start it manually.  See the last entry in the MMDVM_Bridge log file, after the successful connection to the master:
"E:  NXDN unknown tag (13) in TVL".  If I use MMDVM_Bridge V162, this error does not appear.  I don't know if it matters to the operation, as the bridge works OK.
--
Dave WB6DTB


Heiko DL1BZ
 

NXDN unknown tag (13) in TVL
Same here since the last version of MMDVM_Bridge. I don't know if it's important. Everything works OK.

73 Heiko, DL1BZ


Mike Zingman - N4IRR
 

This is just an informational message.  The message is a result of adding a new event type to several of the modes. Please ignore it. I will fix the message in the next release. 


Alec-N1AJW
 

I found that ircDDBGateway was running, which blocked XLX from running.  As for the large CPU increase, my VM went to 100 percent, for I am not sure how long, and that was due to the Quantar bridge.  Would it be possible to turn the Quantar bridge startup off by default, so we can enable it if we use that mode?


Alec


Steve N4IRS
 

Yes, you can stop QB and keep QB from starting at boot. I doubt your CPU going to 100% was caused by QB.

systemctl stop quantar_bridge
systemctl disable quantar_bridge
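
To confirm afterwards that a service is both stopped and kept from starting at boot, its state can be queried with systemctl is-enabled and is-active. The loop below is a small sketch: unit_report is a hypothetical helper name, and ircddbgatewayd in the usage comment is an assumed unit name, so check it against the units actually installed on your system.

```shell
# Print the boot-time and current state of each systemd unit given as an argument.
# unit_report is a hypothetical helper name. SYSTEMCTL may be set to another
# command for testing; it defaults to the real systemctl.
unit_report() {
    for unit in "$@"; do
        # is-enabled reports the boot-time setting, is-active the current state.
        enabled=$("${SYSTEMCTL:-systemctl}" is-enabled "$unit" 2>/dev/null)
        active=$("${SYSTEMCTL:-systemctl}" is-active "$unit" 2>/dev/null)
        printf '%s enabled=%s active=%s\n' "$unit" "${enabled:-unknown}" "${active:-unknown}"
    done
}

# Example use (the second unit name is an assumption):
#   unit_report quantar_bridge ircddbgatewayd
```

A unit that was stopped and disabled as above should report enabled=disabled and active=inactive.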




Mike Zingman - N4IRR
 

However, it IS a known issue that ircDDBGateway can run away and use 100% CPU.


 

Tony Langdon

On 30/3/21 12:33 am, Alec-N1AJW wrote:
> I found that the ircgateway was running which blocked xlx from
> running. and the large cpu increase.  my vm went to 100 percent for
> not sure how long was due to the quantar bridge.  Would it be possible
> to turn the Quantar bridge startup off by default and we can enable it
> if we use that mode??

I've seen similar high CPU usage by Quantar_Bridge and have had to
disable it, since I don't use it.

--
73 de Tony VK3JED/VK3IRL
http://vkradio.com


 

Tony Langdon

On 30/3/21 12:37 am, Steve N4IRS wrote:
> Yes, you can stop QB and keep QB from starting at boot. I doubt your
> CPU going to 100% was caused by QB.

I have seen it cause high CPU usage - not necessarily 100%, but at least
50%, confirmed by top.  In my case, disabling QB was the solution.

--
73 de Tony VK3JED/VK3IRL
http://vkradio.com


Steve N4IRS
 

Did you have anything running with P25? A bridge perhaps?



 

Tony Langdon

Yes, I do run a P25 bridge.






--
73 de Tony VK3JED/VK3IRL
http://vkradio.com