A search of the Factual edge database for a particular phone number returned four listings: two under the current company name and two under an outdated company name. Both of the outdated listings were found in the edge database, but only one of them showed the ‘edge’ label and said ‘OOPS, LOOKS LIKE WE DON’T HAVE ENOUGH DATA YET TO VERIFY THIS RECORD.’ The other listing had no such wording on it.
I wanted to understand the difference between these two listings so I could decide whether to submit edits to further ‘close’ them, or whether they were already good and ‘dead’ in the edge database. I wanted to make sure the edge listings wouldn’t resurface somehow, so I reached out to Factual support and asked them to explain the situation.
Eric Walker kindly replied to my inquiry (emphasis added) saying:
“So there are actually two things going on here. The first listing is in our Edge dataset and not in our Stable dataset due to lack of inputs supporting the record. The second listing does have enough inputs to make it to the Stable dataset but it is filtered out due to its low existence score. A record’s existence score is our calculated metric of whether a business exists and is still open. We only show records in our Stable dataset that have a score that is above a certain threshold.
All listings show up in the Edge dataset because it is a superset of the Stable dataset and it does not have any existence filtering. It is every record we have regardless of being open or closed, lots or little support. […] Most all of our customers only ask for the stable, existence-filtered records. So as long as any records you are concerned with don’t show up when you query the stable dataset, you should be good.”
Bradley Geilfuss also replied saying:
“To add some color to this, records are pushed to stable by way of significant, overlapping evidence of the data. So, for example, submitting data that includes a reference to a Yelp page or Facebook page will help push it from edge to stable. Existence is a machine learned calculation that considers references as well as the reliability of those references. E.g., an unreliable listing service won’t really affect existence, but a recently updated Yelp page would.”
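To make the two replies above concrete, here is a rough Python sketch of how a record might end up in edge only versus stable. It is my own illustration and not Factual's actual logic: the `Listing` class, the function names, and both cutoff values (`MIN_INPUTS` and `EXISTENCE_THRESHOLD`) are assumptions, since Factual does not say how many inputs are "enough" or what the existence threshold is.

```python
from dataclasses import dataclass

# Both cutoffs are hypothetical; the real values are not published.
MIN_INPUTS = 3             # assumed number of supporting inputs needed for stable
EXISTENCE_THRESHOLD = 0.5  # assumed minimum existence score for stable


@dataclass
class Listing:
    name: str
    input_count: int  # how many inputs support the record
    existence: float  # 0.0 (definitely does not exist) to 1.0 (definitely does)


def in_stable(listing: Listing) -> bool:
    """A record is in stable only if it has enough inputs AND passes the existence filter."""
    return listing.input_count >= MIN_INPUTS and listing.existence > EXISTENCE_THRESHOLD


def explain(listing: Listing) -> str:
    """Say why a record is (or is not) visible in the stable dataset."""
    if listing.input_count < MIN_INPUTS:
        return "edge only: not enough inputs"
    if listing.existence <= EXISTENCE_THRESHOLD:
        return "edge only: filtered by low existence score"
    return "stable"


def stable_view(edge_records: list[Listing]) -> list[Listing]:
    """Edge is the superset; stable is the filtered subset of it."""
    return [r for r in edge_records if in_stable(r)]
```

In this sketch, my first outdated listing would be a record with too few inputs, and the second would be a record with plenty of inputs but a low existence score. Both stay out of `stable_view`, which is the only view most Factual customers request.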
Here is what the Factual website has to say about a listing’s existence score and how it affects a listing:
Existence is a machine-learned score, representing the likelihood that a place currently exists. Existence scores are a probability that the place exists, ranging from 0.0 (definitely does not exist) to 1.0 (definitely does).
A place may have a low existence score for many reasons (this is not a comprehensive list):
- There is evidence that the place has closed down
- It is a long time since any new reference has been made to a place
- The place opened very recently, and there is not enough direct or anecdotal data to verify that it actually exists yet
- The place is referenced on the internet, but not in a way that makes it clear that it is a bonafide business (example: a check-in report to “my couch”)
As for which ‘inputs’ or ‘evidence’ from internet yellow page (IYP) venues are best to submit to Factual as proof of a company’s correct, current name/address/phone number (NAP) data, support told me that the “most useful proof are things that aren’t easily spoofed. For example, a Yelp page.” In this way, Factual relies on the vetting of company data done by other IYP venues rather than using its own manpower to vet the data. Moz Local similarly relies on other venues for data verification: it uses the data from a Google My Business listing or a Facebook page as a signal of the authenticity of a business’s data.