Velbus Home Assistant missing module instances

I have a Velbus integration with Home Assistant which works perfectly with a limited number of modules. However, when I connect to my actual installation (69 modules) a number of modules are missing. Is anybody else experiencing the same problem? I connect to Velbus over Velserv and my own CAN bus interface (the Pi HAT described elsewhere on this forum), but the same problem occurs when I connect directly to the Pi's USB with the VMBUSB1 module.

My configuration

This configuration allows VelbusLink to act as a CAN bus sniffer with extensive timing and filtering options for diagnostics. From this logging it is easy to see that a (re)load of the Velbus integration scans every module address and then fetches the channel names (equivalent to a VelbusLink scan). This causes heavy use of the CAN bus, resulting in delays in the modules' response messages. Because the module scan sequence does not pause for results, the traffic and delay on the CAN bus increase significantly with higher module counts, as multiple modules try to send their response messages at the same time.

Logging at start of scan

Logging for module 200 (filter packet.address = 200)

This delay by itself should not be a problem. The same occurs when executing a scan in VelbusLink, but VelbusLink handles this delay without problems. Could this be a problem for the Home Assistant integration? Any other ideas? Is the (Python?) code of the Velbus integration available somewhere on Git, for a better understanding of the implementation?

Any idea or suggestion is appreciated!

the problem is that we wait for a fixed max time.

can you provide me with the logs of the integration when debugging is enabled?

Suggestion 1: restart the (per-module) timeout whenever any message is received from the scanned module, and optionally stop the timeout once the config load is complete.
Suggestion 2: set the timeout to a higher value.
Suggestion 3: keep the fixed wait time, but continue handling config messages even after the timeout has occurred. I have the impression that VelbusLink does this (the scan dialog closes when scanning finishes, but the config tree continues to be updated long after the dialog is closed; VelbusLink also captures and handles config messages triggered from Home Assistant without a scan request in VelbusLink).

In my opinion suggestion 1 would be the preferred solution (fastest, and it avoids saturating the CAN bus); a rough sketch of that idea follows below. Setting a higher timeout value seems easiest, but it will slow down the load a bit, especially in small installations with few modules (not really a big deal, I assume). Suggestion 3 also seems OK, but I have no idea whether it is easy to implement … and the risk of saturating the CAN bus would still exist.
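To make suggestion 1 concrete, here is a minimal asyncio sketch (all names, including config_complete, are illustrative and not the integration's actual API):

    import asyncio

    async def load_module_config(queue: asyncio.Queue, idle_timeout: float = 1.0) -> list:
        # Restart the timeout whenever a message arrives from the scanned
        # module, and optionally stop early once the config load is complete.
        messages = []
        while True:
            try:
                msg = await asyncio.wait_for(queue.get(), timeout=idle_timeout)
            except asyncio.TimeoutError:
                break  # idle window expired: the module has stopped answering
            messages.append(msg)
            if getattr(msg, "config_complete", False):
                break  # early stop when the config load is complete
        return messages

With this, a present module only costs as long as it keeps talking, and an absent one costs at most one idle_timeout.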

Thx for the super fast response!

Where can I find the requested log file?

suggestion 3 is hard to implement, as we wait until the loading is complete to advertise the entities to hass.

maybe option 1 is an option

Option 1 is my preferred solution because it also avoids saturating the CAN bus … only … I hope it is not too much work for you.

1000 x thx in advance!

can you create a github issue about this?

Issue created :ok_hand:

I will be happy to contribute to this nice work!

you would be really welcome to help out, i’m really limited by the time i can put into this project at the moment, so not sure when i will be able to start on it

I was rather thinking about a financial contribution … because I am really a Windows guy (C++, C#, Visual Studio, TFS, …) and an embedded electronics hobbyist.
Still, I would be happy to invest time in Python/Linux development if someone could give me a jump start with the development cycle and tools. Would it be an option to meet in person? But again, this will take some of your time … just let me know.
Another simple (temporary) option could be just to double the fixed timeout. This would also double the initialisation time, but I could live with that.

I had a short look at the code, and as far as I can tell the problem could be solved by slightly reorganising the code of async def scan(self) -> None:

The code starts by sending a ModuleTypeRequest to all modules in a burst (which already puts the CAN bus under stress); afterwards the total timeout is calculated from the number of modules.
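As I read it, the pattern is roughly this (a sketch of the burst-then-fixed-wait behaviour described above, not the actual integration code; all names are illustrative):

    import asyncio
    from typing import Awaitable, Callable

    async def burst_scan(
        send_type_request: Callable[[int], Awaitable[None]],
        timeout_per_module: float,
        module_count: int,
    ) -> None:
        # fire one module-type request per address, back to back,
        # flooding the CAN bus with up to 254 requests at once
        for addr in range(1, 255):
            await send_type_request(addr)
        # then a single fixed wait, scaled by the number of modules;
        # replies that arrive after this window are missed
        await asyncio.sleep(timeout_per_module * module_count)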

I would propose changing the code to a more sequential approach. In rough Python it would look like this (helper names such as _wait_for_module_type and _load_module are illustrative):

    for addr in range(1, 255):
        # send a single request at a time, so the bus stays quiet
        await self.send(ModuleTypeRequestMessage(addr))
        try:
            # on a quiet bus a present module answers almost immediately
            module_type = await asyncio.wait_for(
                self._wait_for_module_type(addr), timeout=0.5
            )
        except asyncio.TimeoutError:
            continue  # module is not available, just continue with the next address
        # query module info and state, and wait for the loading to finish
        await self._load_module(addr, module_type)
Although the sequential code looks slower at first sight, I think it would actually be a lot faster. Because there is only a single ModuleTypeRequestMessage on the bus at a time, the response should come back quickly (no saturated CAN bus), so a very short timeout is enough to decide whether a module is present.

Does this make sense … or did I misunderstand the code?

controller.log (10.9 KB)
handler.log (9.6 KB)
Hi, I implemented the sequential approach in the uploaded files 'handler.log' and 'controller.log' (I renamed them from *.py to *.log to be able to upload them).
All proposed changes are marked with 'lgor'. If you agree with the new approach I could fork on Git … but I don't know how to commit, build, test … :frowning_face:

Forked and ready for code review :wink:

send me a pull-request :slight_smile:

Pull request created