Data collected on me by ROKU

nimji · November 14, 2022, 6:29pm

The IP comment in the article really drives it home: even when using, say, YouTube logged out with no Ad Personalization, your IP is still a data point being stored for this type of cross marketing.

Thanks for sharing!

amarok · November 14, 2022, 7:04pm

Yes. It seems that even a 24/7 VPN connection on all devices, and our many privacy-protecting countermeasures, aren’t enough when there are so many entities and services, including government agencies, happy to make your personal data available on the open market.

Gavaudan · November 14, 2022, 7:14pm

I wonder if it would help to vpn-server-hop from time to time?

amarok · November 14, 2022, 7:22pm

Probably, but if, like me, your TV/Roku streaming is not behind a VPN, then I would think it trivial for data exploiters to get your household location from, say, the internet provider and the Census bureau.

I don’t know if streaming channels would all work with a router-installed VPN service - probably most would not - but if so, it seems like it would be a hassle to frequently change the VPN server at the router for privacy considerations or in the event of slow servers. Plus, my other devices wouldn’t be able to use their individual VPN servers and different service providers.

It’s probably something I should look into, though.

Gavaudan · November 14, 2022, 8:06pm

I would think you could script it, but you’d still have to do at least a little research to see which servers are worth choosing from, particularly because you want speed for streaming. I dunno, maybe it’s not worth it, but interesting to think about at least.

amarok · November 14, 2022, 8:32pm

Apparently (some? most?) VPN client apps can change the server on the router from the desktop, according to my brief internet search, so maybe it’s not such a hassle. (I’ve never done a VPN router setup before.)

irvinewade · November 14, 2022, 10:29pm

I don’t think that this is necessarily correct.

For many customers the WAN side IP address is not static, and does change from time to time, in part due to the shortage of IPv4 addresses. (You may be able to force a change of IP address by doing something on your router. It is something that you could ask your ISP about i.e. the circumstances under which the IP address can be forced to change.)

For some customers, the IP address is not even public i.e. CGNAT is sitting between the customer and the internet, again due to the shortage of IPv4 addresses.

The publicly visible IP address is a relatively poor item to use for tracking, although it can form part of a fingerprint - since for example even if your IP address changes, you are generally still in the same country, still in the same city or region of your country and still with the same ISP.

(It isn’t entirely clear whether the IP address being discussed here is the private IP address of the TV or the WAN side IP address of your router, or something else.)

If you are talking about a smart TV running blackbox code then the IP address is the least of your worries. The TV can have (via some mechanism or other) a globally unique ID and simply provide that in any requests that can use tracking.

From a privacy perspective, smart TVs are best avoided. Instead connect a separate open source box via HDMI to your TV. That just moves the problem to a different box - but at least you have a fighting chance.

j_s · November 14, 2022, 10:52pm

After years of having the same IP address, my router died and when I connected a replacement I got a new IP address, but that might be because I was offline a few days. I think I left the modem powered on and that I had to power cycle it to get the new router to connect, but it has been long enough that I’ve forgotten. I do know that I have power cycled the router or the modem at various times without the IP address changing. Of course, each ISP might do things differently.

irvinewade · November 14, 2022, 11:20pm

Yes, that is always a possibility, which is why you might have to check with your ISP - and they might be unable to tell you or unwilling to tell you.

For all we know, it could depend on the WAN side MAC address of the router (which obviously would have changed when you replaced the router). In that respect it will also depend on what protocol is being used on the WAN side (which in turn depends on your internet connection technology).

In the good old days, routers were capable of cloning the WAN side MAC address - so that, for example, the new router could have the same MAC address as the old router (provided that the old router really did completely die) if keeping the same MAC address was a good thing. In these days of surveillance capitalism, keeping the same MAC address is unlikely to be a good thing.

That too is a possibility. I have heard of situations where ISPs have directed a customer to switch off the router and leave it off for X minutes.

Maybe the next time you go on holidays, you should switch off your router (all your network equipment) - and then see whether your IP address changes once you come back from holidays and switch back on.

especiallydirect · November 15, 2022, 1:33am

Have come, they say? As if it wasn’t already a thing by 2021?
I’ve been calling smart TVs “telescreens” since they’ve been around, but it’s probably fair for anyone to do so now.

amarok · November 15, 2022, 3:12pm

My current pet peeve is the usage of “smart speaker” to describe a too-smart listening device.

National Public Radio: “Ask your smart speaker to play NPR…” (so the manufacturer and we can surveil you 24/7).

amarok · December 24, 2022, 8:58pm

One annoying result of using Pi-hole to block Roku’s tracking is that the Roku will try repeatedly to make certain connections. One of the most annoying ones is scribe.logs.roku.com, which is prevented by the block-list I’ve loaded; however, the Roku will keep trying it thousands of times a day, even when the device is not in use… and it has no off switch. Funny, that.

So to avoid having to wade through 17,000+ blocked entries of scribe.logs.roku.com every time I peek at the activity in Pi-hole, I’m trying a solution I found in this tutorial by user “Anova3” in the Pi-hole forums: https://discourse.pi-hole.net/t/potential-work-around-for-spammy-devices-apps/56024/1

specifically this:

Add BLOCK_TTL=60 to your /etc/pihole/pihole-FTL.conf file.

That sets the TTL (“Time To Live”) on replies to blocked requests, so a client that honors it should cache the blocked answer instead of asking again within that period. (I’m going with “3600” seconds, rather than “60,” though. I’ll report back here if that creates any problems with streaming.)
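As a rough illustration of that edit, here is a minimal sketch that sets a key=value option in config text of the kind pihole-FTL.conf uses. The helper name set_option is made up for this example, and it only works on a string; reading and writing the real file (and restarting FTL afterwards so the change takes effect) is left to the reader.

```python
def set_option(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with key=value set, replacing any existing line for key."""
    new_line = f"{key}={value}"
    replaced = False
    out = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith("#")
        if not is_comment and stripped.split("=", 1)[0].strip() == key:
            if not replaced:
                out.append(new_line)   # overwrite the first definition in place
                replaced = True
            continue                   # drop any later duplicate of the same key
        out.append(line)
    if not replaced:
        out.append(new_line)           # key was absent: append it at the end
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "BLOCKINGMODE=NULL\nBLOCK_TTL=60\n"
    print(set_option(sample, "BLOCK_TTL", "3600"), end="")
```

Working on text rather than the file itself makes it easy to preview the result before touching /etc/pihole/pihole-FTL.conf.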

There are another couple of interesting options discussed in the above forum post, including adjusting the rate-limiting in Pi-hole’s DNS settings, e.g. maximum number of connections for a given period; rate-limiting is per client not ‘per domain’, by the way.

amarok · December 26, 2022, 4:51pm

…Aaaaaannnnd not one of the methods described above, or in Pi-hole’s documentation, seems to stop scribe.logs.roku.com from trying to connect repeatedly. The TTL setting doesn’t seem to work, either, at least for scribe.logs.roku.com.

It’s not escaping into the wild, fortunately, but it’s still annoying that it creates so much activity.

irvinewade · December 27, 2022, 2:29am

What IP address are you poisoning with (aka blockingmode)? 0.0.0.0? The IP address of Pi-hole itself? The loopback IP address? Something else?

Looks as if the options available are:

NULL (default and recommended)
IP-NODATA-AAAA
IP
NXDOMAIN
NODATA

Which option you choose will interact with caching on the DNS client and how (or whether) the configured TTL value is used. In particular, I doubt that the client will get the TTL if you choose NXDOMAIN or NODATA.

In any case, using

nslookup -debug scribe.logs.roku.com.

will show you what TTL is being returned to the DNS client by Pi-hole (assuming that you are not returning different results to different clients).

You may need to include in the above nslookup command -query=A or -query=AAAA, trying each separately, if IPv6 is relevant on your network.
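For anyone who would rather script this check than read nslookup output, the following is a minimal sketch using only the Python standard library. It builds a plain DNS query by hand, sends it over UDP to a chosen server, and returns the record type, TTL and raw data of each answer. The function names are invented for this example, and it ignores truncation, retries and anything beyond the answer section.

```python
import socket
import struct

QTYPE_A = 1      # IPv4 address record
QTYPE_AAAA = 28  # IPv6 address record


def build_query(name: str, qtype: int = QTYPE_A, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet with the 'recursion desired' flag set."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def _skip_name(packet: bytes, offset: int) -> int:
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = packet[offset]
        if length & 0xC0 == 0xC0:  # compression pointer: two bytes, ends the name
            return offset + 2
        if length == 0:            # root label ends the name
            return offset + 1
        offset += 1 + length


def answer_ttls(packet: bytes) -> list[tuple[int, int, bytes]]:
    """Return (record type, ttl, rdata) for every record in the answer section."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    offset = 12
    for _ in range(qdcount):
        offset = _skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    records = []
    for _ in range(ancount):
        offset = _skip_name(packet, offset)
        rtype, _rclass, ttl, rdlength = struct.unpack("!HHIH", packet[offset:offset + 10])
        offset += 10
        records.append((rtype, ttl, packet[offset:offset + rdlength]))
        offset += rdlength
    return records


def query_ttls(server: str, name: str, qtype: int = QTYPE_A, timeout: float = 3.0):
    """Send one UDP query to `server` on port 53 and return its answer records."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server, 53))
        packet, _ = sock.recvfrom(4096)
    return answer_ttls(packet)
```

Calling query_ttls("192.168.1.2", "scribe.logs.roku.com.") would query the Pi-hole directly, where 192.168.1.2 is only a placeholder for its real LAN address. Passing the server explicitly sidesteps whatever resolver the machine happens to be configured with.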

amarok · December 27, 2022, 3:26pm

I’ve tried the whitelist/reply to an IP just beyond my assignable range method.
I’ve tried blacklisting/adding NULL, NXDOMAIN, or NODATA to the pihole-FTL.conf, along with BLOCK_TTL=3600 (or 600).

NULL is preferred, and it routes to 0.0.0.0. Using NULL shows up as “IP” as the reply in the activity list.

The best I’ve been able to achieve so far is 1 attempt per minute for scribe.logs.roku.com, as opposed to the more usual 1 attempt per 30 seconds, or sometimes even more often than that.

I’ll try setting the TTL and removing the line specifying a reply for blocked queries and see what that does.

veleno · December 27, 2022, 8:02pm

It’s a good reason to stop watching TV and start reading books

irvinewade · December 27, 2022, 11:30pm

If setting the TTL then you must use the above nslookup command to see what result you are actually getting.

I would examine the TTL that you are getting in at least the following two scenarios:

blockmode NULL (IP address is 0.0.0.0)
replying with a “valid” IP address on your LAN (that may or may not feature an actual server to respond to a connection)

The TTL would not even be sent in the NXDOMAIN or NODATA blockmode scenarios because there is nowhere to include it in the DNS response.

One possibility is that the Roku spybox is not doing DNS caching at all (and hence the TTL is irrelevant). So the Roku’s logic could be

do the DNS lookup
attempt to connect if the DNS lookup gave an IP address at all
fail to connect (which may involve a timeout on the connection)
sleep X seconds e.g. X=30
and then loops around to retry.
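To see why caching on the client matters so much here, the loop above can be sketched as a small simulation. It counts how many lookups reach the resolver in 24 hours for a client that retries at a fixed interval, with and without a client-side cache that honors the TTL. The function and its parameters are hypothetical and say nothing about how the Roku really behaves.

```python
def lookups_per_day(retry_seconds: int, ttl_seconds: int, client_caches: bool) -> int:
    """Count DNS lookups a retrying client sends to the resolver in 24 hours."""
    day = 24 * 60 * 60
    lookups = 0
    cached_until = -1  # time until which a cached answer is still valid
    for now in range(0, day, retry_seconds):
        if client_caches and now < cached_until:
            continue   # answered from the client's own cache, nothing is logged
        lookups += 1   # this attempt reaches the resolver and its query log
        cached_until = now + ttl_seconds
    return lookups


if __name__ == "__main__":
    print(lookups_per_day(30, 3600, client_caches=False))  # 2880
    print(lookups_per_day(30, 3600, client_caches=True))   # 24
```

With a 30-second retry and no caching, the TTL has no effect at all, which would be consistent with a BLOCK_TTL change making no visible difference.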

(If the Roku is doing DNS caching, are there any settings on the Roku to control that? In particular, since DNS provides no means of specifying a TTL for an NXDOMAIN response, the client has to have a default time to cache a “negative” response. So ideally you would use blockmode NXDOMAIN for this lookup and then set the Roku to use a large TTL for caching negative responses e.g. 1 hour.)

If you can’t tame the Roku then (apart from @veleno’s suggestion) some options could be:

use a dedicated Pi-hole for the Roku (if your DHCP server supports that) and/or with a dedicated subnet for the Roku
get a L2 filtering device and just block unwanted DNS lookups
modify the Pi-hole software so that it reduces the amount of duplicated logging of pointless DNS lookups (it’s open source, right?)
respond to the Roku DNS lookup with a valid IP address on your local LAN, with a valid server at that IP address that responds to the Roku’s requests but, as far as is possible, wastes the Roku’s time and/or tosses away whatever the Roku has to say (this of course implies an understanding of what type of server the Roku is attempting to connect to and what it is doing)
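The last option can be sketched without knowing the protocol, as a plain TCP server that accepts a connection, reads whatever arrives and throws it away. This is only a sketch under assumptions: the port below is a placeholder (the Roku most likely expects a specific port such as 443, which needs elevated privileges to bind), and if the device speaks TLS the handshake will never complete, so it will simply wait until its own timeout.

```python
import socketserver
import threading


class DiscardHandler(socketserver.BaseRequestHandler):
    """Read whatever the client sends and throw it away."""

    def handle(self) -> None:
        self.request.settimeout(10)         # do not hold idle connections forever
        try:
            while self.request.recv(4096):  # empty bytes means the peer closed
                pass
        except OSError:
            pass                            # timeout or reset: just drop it


def start_sink(host: str = "0.0.0.0", port: int = 8443) -> socketserver.ThreadingTCPServer:
    """Start the discard server in a background thread and return it."""
    server = socketserver.ThreadingTCPServer((host, port), DiscardHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    sink = start_sink()
    print("discarding on", sink.server_address)
    try:
        threading.Event().wait()            # keep the main thread alive
    except KeyboardInterrupt:
        sink.shutdown()
        sink.server_close()
```

Nothing is stored or logged, so the device's data goes nowhere, but whether this reduces or increases its retry rate can only be found by trying it.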

amarok · December 29, 2022, 1:13am

With the BLOCKINGMODE= line removed from the FTL config (which is the same as specifying NULL/0.0.0.0), and with BLOCK_TTL=3600 applied, I got:

(with scribe.logs.roku.com blacklisted)

nslookup -debug
> scribe.logs.roku.com
;; connection timed out; no servers could be reached

And the same response with scribe.logs.roku.com whitelisted with a reply=a real IP address on my network (which was the L5, powered off or powered on, no difference).

A random, normally whitelisted item returns something like this:

$ nslookup -debug
> mediaservices.cdn-apple.com        
Server:		1.1.1.1
Address:	1.1.1.1#53

------------
    QUESTIONS:
	mediaservices.cdn-apple.com, type = A, class = IN
    ANSWERS:
    ->  mediaservices.cdn-apple.com
	canonical name = mediaservices.cdn-apple.com.akadns.net.
	ttl = 3575
    ->  mediaservices.cdn-apple.com.akadns.net
	canonical name = mediaservices.cdn-apple.com.edgesuite.net.
	ttl = 35
    ->  mediaservices.cdn-apple.com.edgesuite.net
	canonical name = a1915.dscw154.akamai.net.
	ttl = 275
    ->  a1915.dscw154.akamai.net
	internet address = 23.72.90.136
	ttl = 11
    ->  a1915.dscw154.akamai.net
	internet address = 23.72.90.139
	ttl = 11
    AUTHORITY RECORDS:
    ADDITIONAL RECORDS:
------------
Non-authoritative answer:
mediaservices.cdn-apple.com	canonical name = mediaservices.cdn-apple.com.akadns.net.
mediaservices.cdn-apple.com.akadns.net	canonical name = mediaservices.cdn-apple.com.edgesuite.net.
mediaservices.cdn-apple.com.edgesuite.net	canonical name = a1915.dscw154.akamai.net.
Name:	a1915.dscw154.akamai.net
Address: 23.72.90.136
Name:	a1915.dscw154.akamai.net
Address: 23.72.90.139
------------
    QUESTIONS:
	a1915.dscw154.akamai.net, type = AAAA, class = IN
    ANSWERS:
    ->  a1915.dscw154.akamai.net
	has AAAA address 2600:1406:4200:3::1748:5a8b
	ttl = 20
    ->  a1915.dscw154.akamai.net
	has AAAA address 2600:1406:4200:3::1748:5a88
	ttl = 20
    AUTHORITY RECORDS:
    ADDITIONAL RECORDS:
------------
Name:	a1915.dscw154.akamai.net
Address: 2600:1406:4200:3::1748:5a8b
Name:	a1915.dscw154.akamai.net
Address: 2600:1406:4200:3::1748:5a88

I’m not sure what the significance of 1.1.1.1 (Cloudflare) is; my designated provider is Quad9.

The frequency of scribe.logs.roku.com lookups changed from every 30 seconds when blocked to a repeating 10-second-then-50-second pattern when “whitelisted” with the reply=an IP address option.

OOPS:
Just realized that I needed to remove scribe.logs.roku.com from my router’s URL Filter list, so repeating the above (first blacklisted, then whitelisted):

$ nslookup -debug
> scribe.logs.roku.com
Server:		1.1.1.1
Address:	1.1.1.1#53

------------
    QUESTIONS:
	scribe.logs.roku.com, type = A, class = IN
    ANSWERS:
    ->  scribe.logs.roku.com
	internet address = 44.212.206.50
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 52.55.94.225
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 44.208.245.146
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 3.211.149.39
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 34.232.110.100
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 3.216.103.160
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 23.21.146.84
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 54.87.25.116
	ttl = 32
    AUTHORITY RECORDS:
    ADDITIONAL RECORDS:
------------
Non-authoritative answer:
Name:	scribe.logs.roku.com
Address: 44.212.206.50
Name:	scribe.logs.roku.com
Address: 52.55.94.225
Name:	scribe.logs.roku.com
Address: 44.208.245.146
Name:	scribe.logs.roku.com
Address: 3.211.149.39
Name:	scribe.logs.roku.com
Address: 34.232.110.100
Name:	scribe.logs.roku.com
Address: 3.216.103.160
Name:	scribe.logs.roku.com
Address: 23.21.146.84
Name:	scribe.logs.roku.com
Address: 54.87.25.116
------------
    QUESTIONS:
	scribe.logs.roku.com, type = AAAA, class = IN
    ANSWERS:
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf20:2042::12
	ttl = 82
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf21:2042::6
	ttl = 82
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf20:2042::a
	ttl = 82
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf21:2042::f
	ttl = 82
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf22:2042::e
	ttl = 82
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf21:2042::a
	ttl = 82
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf21:2042::8
	ttl = 82
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf22:2042::2
	ttl = 82
    AUTHORITY RECORDS:
    ADDITIONAL RECORDS:
------------
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf20:2042::12
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf21:2042::6
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf20:2042::a
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf21:2042::f
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf22:2042::e
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf21:2042::a
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf21:2042::8
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf22:2042::2
>

And:

$ nslookup -debug
> scribe.logs.roku.com
Server:		1.1.1.1
Address:	1.1.1.1#53

------------
    QUESTIONS:
	scribe.logs.roku.com, type = A, class = IN
    ANSWERS:
    ->  scribe.logs.roku.com
	internet address = 52.205.20.96
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 3.230.6.207
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 44.212.206.50
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 35.173.106.58
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 23.21.146.84
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 44.212.206.19
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 52.23.98.233
	ttl = 32
    ->  scribe.logs.roku.com
	internet address = 44.211.107.147
	ttl = 32
    AUTHORITY RECORDS:
    ADDITIONAL RECORDS:
------------
Non-authoritative answer:
Name:	scribe.logs.roku.com
Address: 52.205.20.96
Name:	scribe.logs.roku.com
Address: 3.230.6.207
Name:	scribe.logs.roku.com
Address: 44.212.206.50
Name:	scribe.logs.roku.com
Address: 35.173.106.58
Name:	scribe.logs.roku.com
Address: 23.21.146.84
Name:	scribe.logs.roku.com
Address: 44.212.206.19
Name:	scribe.logs.roku.com
Address: 52.23.98.233
Name:	scribe.logs.roku.com
Address: 44.211.107.147
------------
    QUESTIONS:
	scribe.logs.roku.com, type = AAAA, class = IN
    ANSWERS:
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf20:2042::2
	ttl = 88
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf20:2042::f
	ttl = 88
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf20:2042::11
	ttl = 88
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf21:2042::8
	ttl = 88
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf22:2042::c
	ttl = 88
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf22:2042::e
	ttl = 88
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf20:2042::d
	ttl = 88
    ->  scribe.logs.roku.com
	has AAAA address 2600:1f18:621f:bf20:2042::8
	ttl = 88
    AUTHORITY RECORDS:
    ADDITIONAL RECORDS:
------------
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf20:2042::2
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf20:2042::f
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf20:2042::11
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf21:2042::8
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf22:2042::c
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf22:2042::e
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf20:2042::d
Name:	scribe.logs.roku.com
Address: 2600:1f18:621f:bf20:2042::8
>

The Pi-hole on the Roku handles all my network DNS.

It may be open source, but I don’t see how its software could be modified by the user. (This user, anyway.)

At least I can just continue as before, albeit with tens of thousands of redundant attempts clogging up the activity view.

irvinewade · December 29, 2022, 3:17am

I think the significance is: your testing is invalid.

Whatever computer you are typing those nslookup commands on is not using Pi-hole as the DNS server. Right? So you would need to sort that out first. (Which computer is this from?)
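One quick way to check this on a Linux machine is to look at which nameserver its resolver configuration lists first. The sketch below parses resolv.conf-style text; the function names are made up for this example, and on systems where a local stub resolver or a VPN client rewrites the file, the address found there may not be the server that finally answers.

```python
def nameservers(resolv_conf_text: str) -> list[str]:
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore trailing comments
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def uses_resolver(resolv_conf_text: str, expected: str) -> bool:
    """True if `expected` is the first nameserver listed, the one tried first."""
    servers = nameservers(resolv_conf_text)
    return bool(servers) and servers[0] == expected


if __name__ == "__main__":
    with open("/etc/resolv.conf", encoding="utf-8") as handle:
        text = handle.read()
    print(nameservers(text))
```

If the first entry is not the Pi-hole's address, test lookups from that machine say nothing about what Pi-hole returns.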

Note also that the domain name when I supplied it had a dot on the end but when you’ve typed it, you left the dot off. In this case I doubt it will make a difference but you should get into the habit of supplying the dot when specifically testing DNS lookups.

So you are running Pi-hole on the Roku itself? If so, I don’t think that that is very safe.

This might be a crazy suggestion but if you are messing around on the Roku itself, why not just nobble the troublesome domain in /etc/hosts and then it will never do a DNS lookup for that domain?

amarok · December 29, 2022, 4:42pm

I was ssh’ed into the Rpi from a laptop with a VPN activated; the VPN is supplying DNS through OpenNic, I think, but the Rpi, with Pi-hole installed, is not routed through the VPN.

All devices on my network use Pi-hole (installed on the Rpi) for DNS lookup, unless the device is connected through the VPN. In the Pi-hole settings panel, I have designated only Quad9 as the provider. Quad9 is also the designated provider on my router itself.

I’ve read that Rokus are generally thought to be predisposed to use Google’s DNS service and are sometimes difficult to force to use Pi-hole’s designated provider, although it’s possible to correct that, as I have done. The Pi-hole activity display indicates that each successful connection is routed through Quad9, and each unsuccessful connection is blocked.

Pi-hole is installed on the RaspberryPi. It’s not possible to install anything on a Roku (without hacking into it, if that’s even possible).

Essentially, I think the Pi-hole is working like the /etc/hosts file anyway. It’s just that certain blocked connections don’t like to hear “No!” and try relentlessly to connect anyway, even when they’re routed to 0.0.0.0.

P.S. IPv6 is disabled on my network, FYI.