I’m not sure what has been going on over the last few weeks, but the number of submissions on all of my sites has dropped. The drop isn’t drastic; what’s more worrying is that the percentage of spam submissions has increased dramatically. I now end up rejecting 40 to 50% of submissions.
There seem to be a large number of sites and blogs using free DVD converter packages for submission purposes. An increasing number of other software download sites are using similar programs to make submissions. I’ve also noticed a raft of new download sites being created with random names, like s666.info. This happened some years ago in the general web directory arena and resulted in the whole sector being devalued. I hope software download sites aren’t going the same way. Even if you aren’t too worried about dubious submissions – content is content, after all – the majority of them use very nearly the same text, so it doesn’t take long to rack up a hundred or more pages with virtually identical content.
5 Comments
I may have mentioned this already, but a few weeks ago I was making some fixes to the softtester robot software after I was made aware that I was rejecting PAD files which relied purely on the new PAD spec category.
At the time I was probably only accepting maybe 20 or 30 updates a day. Since the fixes, I get about 100 valid transactions a day. Apart from using your blacklist, I limit the number of programs an author can have and the number of programs in a category.
“very nearly the same text”…
Do you mean the description text? That could be time consuming to do as a text search on our databases…
The only thing I can think of to filter out spam is to store a CRC for each download file and only accept files whose CRC differs from those already stored. But downloading all the programs each time would cause me problems.
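For what it’s worth, the CRC idea could look something like the rough sketch below. It assumes the download’s bytes are already in hand (fetching them is the expensive part mentioned above and isn’t shown), and the names `crc32_of`, `accept_if_new` and `seen_crcs` are made up for illustration, not taken from anyone’s actual robot.

```python
import zlib


def crc32_of(data: bytes) -> int:
    """Return the unsigned CRC-32 of a download's bytes."""
    # Masking keeps the value in the 0..2**32-1 range on every platform.
    return zlib.crc32(data) & 0xFFFFFFFF


def accept_if_new(data: bytes, seen_crcs: set) -> bool:
    """Accept a file only if its CRC has not been stored before.

    `seen_crcs` is a hypothetical in-memory store; a real site would
    presumably keep the values in a database column instead.
    """
    crc = crc32_of(data)
    if crc in seen_crcs:
        return False  # same checksum as an earlier file: treat as a repeat
    seen_crcs.add(crc)
    return True
```

A CRC only catches byte-for-byte identical files, so a repackaged installer would still get through, and two different files can in principle share a CRC-32.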
I agree that this could cause problems.
Good point about the PAD spec. I don’t think this is the
problem – though I can’t be certain. I do get about 100
submissions a day on each of my sites. It’s just that a lot
are spam.
By similar, I mean that give or take a word, the titles
and descriptions and images are identical. They are
almost always very short descriptions too.
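As a rough illustration of "identical give or take a word", here is a minimal sketch that compares titles and descriptions word by word using Python’s standard `difflib`. The field names (`title`, `description`), the helper names and the 0.8 threshold are all assumptions picked for the example, not anything from a real site.

```python
from difflib import SequenceMatcher


def words(text: str) -> list:
    """Lower-case the text and split it into words."""
    return text.lower().split()


def similarity(a: str, b: str) -> float:
    """Return a 0..1 score for how much two texts share, word by word."""
    return SequenceMatcher(None, words(a), words(b)).ratio()


def is_near_duplicate(new: dict, existing: dict, threshold: float = 0.8) -> bool:
    """Flag a submission whose title AND description both nearly match.

    `threshold` is an arbitrary starting point and would need tuning
    against real submissions.
    """
    return (
        similarity(new["title"], existing["title"]) >= threshold
        and similarity(new["description"], existing["description"]) >= threshold
    )
```

Comparing every new submission against every existing listing this way gets slow on a large database, so it probably only makes sense within a category or against recent submissions.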
I have a query which shows 10 relevant programs at the bottom of my program listing page. Maybe deny a submission if there are too many relevant programs?
I’m not sure whether it produces a number, e.g. a % relevance…
I actually get about 180 submissions a day, but about 80 are either spam emails or ones I reject due to PAD file problems.
Maybe deny short descriptions?
Having said that, my own progs might then get rejected.
It could be perfectly valid that there are lots of relevant
programs so I wouldn’t use that as a reason to reject.
If you have a list of submissions with titles etc. it soon
becomes easier to spot the crap. All I then do is hunt
down programs by the same authors or with similar titles
and delete the lot.
In my admin backend, I have a duplicate set of pages to
the main site, except that when the pages are accessed,
extra DELETE links appear, making it a doddle to delete
listings or everything by an author. For a while now, I’ve
been stuck at about 27000 listings. As soon as the
number gets to 27500 I seem to find a load of rubbish I
can delete. It’s weird how after a while something slips
you by and you discover 2% of your listings are
screensavers with almost identical names.
I really must write something to scan existing listings against your blacklist. It has been on my to-do list for quite some time.
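Something along these lines might do as a starting point. It is only a sketch: it assumes the blacklist is a plain list of terms matched as substrings, and that each listing is a record with `id`, `author`, `title` and `url` fields. The real blacklist format and listing schema may well differ.

```python
def find_blacklisted(listings: list, blacklist: list) -> list:
    """Return (listing id, matched terms) for every listing that hits the blacklist."""
    # Ignore blank entries and compare case-insensitively.
    terms = [t.strip().lower() for t in blacklist if t.strip()]
    flagged = []
    for listing in listings:
        # Hypothetical fields; missing ones are treated as empty strings.
        haystack = " ".join(
            str(listing.get(field, "")) for field in ("author", "title", "url")
        ).lower()
        hits = [t for t in terms if t in haystack]
        if hits:
            flagged.append((listing["id"], hits))
    return flagged
```

The result is a list to review by hand rather than delete automatically, since a substring match can catch innocent listings too.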