Posts 78

Comments 186

ShareGPT @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

Tingling and palpitations spreading throughout the body

The tingling and palpitations that have been spreading throughout the body can be caused by several conditions. Here are some possibilities based on the search results:

Episodic Neurologic Symptoms: Patients with this condition usually complain of severe apprehension, dizziness, chest discomfort, palpitations, or fear of "losing control." The onset is usually under age 40[1].
Heart Attack or Panic Attack: Both conditions can cause chest pain, shortness of breath, dizziness, sweating, passing out, tingling, or a sensation of impending doom. These shared symptoms of heart and panic attack may also be caused by other serious conditions such as blood clots, lung infection or collapse, or tear in the large vessels of the chest for patients with certain pre-existing risk factors[2].
Atrial Fibrillation: The most obvious symptom of this condition is heart palpitations, where the heart feels like it's pounding, fluttering, or beating irregularly, often for a few seconds or possibly a few minutes. Other symptoms include tiredness, feeling lethargic, chest pain, dizziness, and shortness of breath[3][6].
Peripheral Neuropathy: Symptoms of this condition include numbness and tingling in the feet or hands, burning, stabbing or shooting pain in affected areas, loss of balance and coordination, and muscle weakness, especially in the feet. These symptoms are usually constant but may come and go[4].

It's important to see a doctor if you experience any of these symptoms to determine the underlying cause and receive appropriate treatment.

Citations: [1] https://www.ncbi.nlm.nih.gov/books/NBK374/ [2] https://www.cedars-sinai.org/blog/is-it-a-heart-attack-or-a-panic-attack.html [3] https://www.nhs.uk/conditions/atrial-fibrillation/symptoms/ [4] https://www.nhsinform.scot/illnesses-and-conditions/brain-nerves-and-spinal-cord/peripheral-neuropathy [5] https://www.webmd.com/heart-disease/heart-palpitations-emergency [6] https://www.hopkinsmedicine.org/health/conditions-and-diseases/atrial-fibrillation/afib-symptoms

Singularity | Artificial Intelligence (ai), Technology & Futurology @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

China Issues Rules for Generative AI, Mandating Adherence to 'Socialist Values' (article from 14.07.2023)

www.maginative.com China Issues Rules for Generative AI, Mandating Adherence to 'Socialist Values'

With the new rules, China seeks to strike a balance between advancing their AI capabilities while maintaining rigid ideological control.

12mo ago

How to download a website for offline browsing using wget?

There are several Linux command-line tools that can be used to download a mirror of a website for offline browsing. Here are some of the most popular ones:

wget: wget is a free utility for non-interactive download of files from the Web. It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies. To download a website recursively, use the -r option, or --mirror, which turns on recursion together with timestamping and unlimited depth. For example, the following command will download the entire website located at http://example.com:
```
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://example.com
```
This will create a local copy of the website in a directory named example.com.
httrack: httrack is a free and open-source website copier that allows you to download a website and browse it offline. It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies. To download a website, use the -O option followed by the directory where you want to save the website. For example, the following command will download the entire website located at http://example.com:
```
httrack http://example.com -O /path/to/save/directory
```
This will create a local copy of the website in the directory /path/to/save/directory/example.com.
curl: curl is a command-line tool for transferring data from or to a server, using one of the supported protocols (HTTP, HTTPS, FTP, etc.). Unlike wget and httrack, curl has no recursive mode (its -r option selects a byte range), so it cannot mirror a whole site on its own; it only fetches the URLs you give it. For example, the following command follows redirects and saves the single page at http://example.com:
```
curl -L --retry 3 --retry-delay 5 -o index.html http://example.com
```
This will save that one page as index.html in the current directory.

These tools have different options and features, so you should choose the one that best suits your needs.


12mo ago

How to download a website for offline browsing using wget?

what are the best linux cli tools to download a mirror of a website for offline browsing?

12mo ago

How to download a website for offline browsing using wget?

wget -mkEpnp

wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://example.org

Explanation of the various flags:

--mirror – Makes (among other things) the download recursive.
--convert-links – Converts all links (including those to assets such as CSS stylesheets) to relative ones, so the copy is suitable for offline viewing.
--adjust-extension – Adds suitable extensions to filenames (html or css) depending on their content type.
--page-requisites – Downloads things like CSS stylesheets and images required to properly display the page offline.
--no-parent – When recursing, does not ascend to the parent directory. It is useful for restricting the download to only a portion of the site.

wget -mpHkKEb -t 1 -e robots=off -U 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0' http://www.example.com

-m (--mirror) : turn on options suitable for mirroring (infinite recursive download and timestamps).

-p (--page-requisites) : download all files that are necessary to properly display a given HTML page. This includes such things as inlined images, sounds, and referenced stylesheets.

-H (--span-hosts): enable spanning across hosts when doing recursive retrieving.

-k (--convert-links) : after the download, convert the links in the document for local viewing.

-K (--backup-converted) : when converting a file, back up the original version with a .orig suffix. Affects the behavior of -N.

-E (--adjust-extension) : add the proper extension to the end of the file.

-b (--background) : go to background immediately after startup. If no output file is specified via the -o, output is redirected to wget-log.

-e (--execute) : execute command (robots=off).

-t number (--tries=number) : set number of tries to number.

-U (--user-agent) : identify as agent-string to the HTTP server. Some servers may ban you permanently for recursive downloading if you send the default User-Agent.
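If the same mirror job needs to be scripted, a minimal Python sketch could assemble the long-form flags explained above and pass them to wget through subprocess. The helper names build_mirror_command and mirror and their parameters are made up for illustration and are not part of wget; the sketch assumes wget is installed and on PATH.

```python
import shlex
import subprocess


def build_mirror_command(url, user_agent=None, tries=1, ignore_robots=False):
    """Return the wget argument list for an offline mirror (hypothetical helper)."""
    cmd = [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--convert-links",     # rewrite links for local viewing
        "--adjust-extension",  # add .html/.css extensions where needed
        "--page-requisites",   # fetch images, stylesheets, etc.
        "--no-parent",         # never ascend above the start directory
        f"--tries={tries}",
    ]
    if ignore_robots:
        cmd += ["-e", "robots=off"]
    if user_agent:
        cmd.append(f"--user-agent={user_agent}")
    cmd.append(url)
    return cmd


def mirror(url, **options):
    """Run wget and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_mirror_command(url, **options), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(shlex.join(build_mirror_command("http://example.com")))
```

Passing the arguments as a list avoids shell quoting problems with long User-Agent strings.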

Cronjobs

0 23 * * * cd ~/Documents/Webs/mirror; wget -mpk -t 1 -e robots=off -U 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0' https://example.com

0 23 * * * cd ~/Documents/Webs/mirror; wget -mpkH -t 1 -e robots=off -U 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0' -D example.com https://example.com

0 23 * 1 * cd ~/Documents/Webs/mirror; wget -mpk -t 1 -e robots=off -U 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0' https://example.com

0 8 * * * pkill wget; cd ~/Documents/Webs/mirror/; rm wget*

ShareGPT @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

How to download a website for offline browsing using wget?

To download a website for offline browsing using wget, you can use the following command:

wget --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains example.com --no-parent http://www.example.com

Replace example.com with the website you want to download. This command will download the entire website, including all the elements that compose the page (images, CSS, etc.), and convert the links so that they work locally and offline[1].

Here's a brief explanation of the options used in the command:

--recursive: Download the entire website recursively.
--no-clobber: Don't overwrite existing files.
--page-requisites: Download all the elements required to display the page properly (images, CSS, etc.).
--html-extension: Save files with the .html extension.
--convert-links: Convert links so that they work locally and offline.
--restrict-file-names=windows: Modify filenames so that they work in Windows as well.
--domains example.com: Don't follow links outside the specified domain.
--no-parent: Don't follow links outside the specified directory.

If the website uses external resources like jQuery, you can try using the --span-hosts option to download resources from other domains as well:

wget --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains example.com --span-hosts --no-parent http://www.example.com

However, this may also download additional content from other domains that you might not need. To avoid downloading unnecessary content, you can manually download the required resources (like jQuery) and update the HTML files to use the local copies instead.

Keep in mind that some websites may have measures in place to prevent downloading their content using tools like wget. In such cases, you may need to adjust the command options or use alternative methods to download the website for offline browsing[6].

Citations: [1] https://www.linuxjournal.com/content/downloading-entire-web-site-wget [2] https://winaero.com/make-offline-copy-of-a-site-with-wget-on-windows-and-linux/amp/ [3] https://stackoverflow.com/questions/10842263/wget-download-for-offline-viewing-including-absolute-references [4] https://askubuntu.com/questions/391622/download-a-whole-website-with-wget-or-other-including-all-its-downloadable-con [5] https://superuser.com/questions/970323/using-wget-to-copy-website-with-proper-layout-for-offline-browsing [6] https://www.computerhope.com/unix/wget.htm [7] https://superuser.com/questions/1672776/download-whole-website-wget [8] https://gist.github.com/stvhwrd/985dedbe1d3329e68d70 [9] https://simpleit.rocks/linux/how-to-download-a-website-with-wget-the-right-way/ [10] https://www.guyrutenberg.com/2014/05/02/make-offline-mirror-of-a-site-using-wget/ [11] https://linuxreviews.org/Wget:_download_whole_or_parts_of_websites_with_ease [12] https://brain-dump.space/articles/how-to-get-full-offline-website-copy-using-wget-on-mac-os/ [13] https://dev.to/jjokah/how-to-download-an-entire-website-for-offline-usage-using-wget-2lli [14] https://alvinalexander.com/linux-unix/how-to-make-offline-mirror-copy-website-with-wget [15] https://askubuntu.com/questions/979655/using-wget-and-having-websites-working-properly-offline

Singularity | Artificial Intelligence (ai), Technology & Futurology @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

Mind blown 🤯🤯

Bing (multimodal) image input is free!

12mo ago

limit python requests to 1.39 requests per second

It isn't chatgpt. It's an LLM with search

Lemmy @lemmy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

Post title character limit is 201

I couldn't find this using the search.

12mo ago

limit python requests to 1.39 requests per second

To achieve a rate of 1.39 requests per second, you can use a global variable to keep track of the time elapsed between requests and then calculate the delay based on that time. Here's a modified version of your fetch_github_data function that implements this:

import time
import requests
import logging

PERSONAL_ACCESS_TOKEN = "your_personal_access_token"
DELAY = 1 / 1.39  # Calculate the delay for 1.39 requests per second
last_request_time = 0  # Initialize a global variable to store the last request time


def fetch_github_data(url):
    global last_request_time  # Access the global variable

    try:
        headers = {
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {PERSONAL_ACCESS_TOKEN}",
            "X-GitHub-Api-Version": "2022-11-28",
        }

        # Calculate the time elapsed since the last request
        time_elapsed = time.time() - last_request_time

        # Calculate the required delay based on the time elapsed
        required_delay = max(0, DELAY - time_elapsed)

        # Sleep for the required delay
        time.sleep(required_delay)

        response = requests.get(url, headers=headers)

        # Update the last request time
        last_request_time = time.time()

        logging.info(f"Fetched data from {url}")
        return response.json()
    except requests.exceptions.RequestException as e:
        logging.exception(f"Error fetching data from {url}\n{e}")
        raise

This code calculates the required delay based on the desired rate of 1.39 requests per second and the time elapsed since the last request. It then sleeps for the required delay before making the next request. The global variable last_request_time is used to keep track of the time of the last request.
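The same idea can be packaged without a module-level global. The sketch below is one possible variant, not the code from the question: the class name RateLimiter and its clock and sleep parameters are illustrative assumptions. It uses time.monotonic so system clock adjustments cannot shorten the delay, and it spaces request start times, whereas the function above measures from the end of the previous response.

```python
import time


class RateLimiter:
    """Ensure successive wait() calls return at least 1/rate seconds apart."""

    def __init__(self, rate_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_second
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._last_call = None   # None until the first request is made

    def wait(self):
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


# One shared limiter for 1.39 requests per second.
limiter = RateLimiter(1.39)
```

Calling limiter.wait() immediately before each requests.get(...) call then keeps the request rate at or below 1.39 per second in a single-threaded script.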


ShareGPT @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

limit python requests to 1.39 requests per second

To limit your Python requests to 1.39 requests per second, you can use the time.sleep() function from the time module to introduce a delay between requests. You can also use third-party libraries like pyrate-limiter, requests-ratelimiter, or ratelimiter to achieve rate limiting.

Here's an example using the time module:

```python
import requests
import time

urls = [...]  # List of URLs to send requests to

for url in urls:
    response = requests.get(url)
    print(response.status_code)
    time.sleep(1 / 1.39)  # Introduce a delay to stay at or below 1.39 requests per second
```

Alternatively, you can use the pyrate-limiter library:

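```python
from pyrate_limiter import Duration, RequestRate, Limiter
import requests

# 1 request per 1/1.39 ≈ 0.72 seconds, i.e. 1.39 requests per second
rate_limit = RequestRate(1, Duration.SECOND / 1.39)
limiter = Limiter(rate_limit)

urls = [...]  # List of URLs to send requests to

for url in urls:
    # pyrate-limiter 2.x API: ratelimit() is the context manager;
    # delay=True sleeps until a slot is free instead of raising.
    with limiter.ratelimit("requests", delay=True):
        response = requests.get(url)
        print(response.status_code)
```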

Another option is to use the requests-ratelimiter library:

```python
from requests import Session
from requests_ratelimiter import LimiterAdapter

session = Session()

# Apply a rate limit of 1.39 requests per second to all requests
adapter = LimiterAdapter(per_second=1.39)
session.mount('http://', adapter)
session.mount('https://', adapter)

urls = [...]  # List of URLs to send requests to

for url in urls:
    response = session.get(url)
    print(response.status_code)
```

These examples demonstrate different ways to limit your Python requests to 1.39 requests per second. Choose the method that best suits your needs and project requirements.


ShareGPT @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

How to avoid defining sqlite3 conn and cursor in many functions across the code?

To avoid defining the sqlite3.Connection and sqlite3.Cursor in many functions throughout your code, you can use the following approaches:

Use a context manager: Create a context manager that handles the connection and cursor creation, as well as closing the connection when done. This way, you can use the with statement to manage the connection and cursor in your functions.

```python
import sqlite3

DB_FILE = "your_database_file.db"


class DatabaseConnection:
    def __enter__(self):
        self.conn = sqlite3.connect(DB_FILE)
        self.cursor = self.conn.cursor()
        return self.cursor

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.conn.commit()
        self.cursor.close()
        self.conn.close()


# Issue is the asker's own class; it is not defined in this snippet.
def insert_post_to_db(issue: Issue, lemmy_post_id: int) -> None:
    with DatabaseConnection() as cursor:
        cursor.execute(
            "INSERT INTO posts (issue_url, lemmy_post_id, issue_title, issue_body) VALUES (?, ?, ?, ?)",
            (issue.url, lemmy_post_id, issue.title, issue.formatted_body),
        )
```

Use a function decorator: You can create a decorator that handles the connection and cursor creation and passes the cursor to the decorated function.

```python
import sqlite3
from functools import wraps

DB_FILE = "your_database_file.db"


def with_cursor(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        conn = sqlite3.connect(DB_FILE)
        cursor = conn.cursor()
        try:
            result = func(cursor, *args, **kwargs)
            conn.commit()
            return result
        finally:
            # Always release the connection, even if func raises.
            cursor.close()
            conn.close()

    return wrapper


# Issue is the asker's own class; it is not defined in this snippet.
@with_cursor
def insert_post_to_db(cursor: sqlite3.Cursor, issue: Issue, lemmy_post_id: int) -> None:
    cursor.execute(
        "INSERT INTO posts (issue_url, lemmy_post_id, issue_title, issue_body) VALUES (?, ?, ?, ?)",
        (issue.url, lemmy_post_id, issue.title, issue.formatted_body),
    )
```

Both of these approaches help you avoid defining the connection and cursor in every function that interacts with the database. The context manager approach is more explicit and easier to read, while the function decorator approach is more concise. Choose the one that best fits your coding style and project requirements.
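A more compact variant of the context manager idea can be written with contextlib.contextmanager. This is only a sketch: the name db_cursor and its db_file parameter are illustrative, and the explicit rollback on error is an added design choice rather than something taken from the answer above.

```python
import sqlite3
from contextlib import contextmanager

DB_FILE = "your_database_file.db"


@contextmanager
def db_cursor(db_file=DB_FILE):
    """Yield a cursor; commit on success, roll back on error, always close."""
    conn = sqlite3.connect(db_file)
    try:
        cursor = conn.cursor()
        yield cursor
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()
```

Any function can then write `with db_cursor() as cursor:` and run its queries, without opening or closing anything itself.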


Programming.dev Meta @programming.dev InternetPirate @lemmy.fmhy.ml 12mo ago

Are AI-Generated Solutions Welcomed on programming.dev?

I wanted to start a discussion about the use of AI-generated solutions on Programming.dev. Personally, I've found that AI-powered tools have been incredibly helpful in solving programming questions. I won't name any specific commercial software, but I use one that combines GPT-4 and web search to get more factual information. I write some answers I think I might revisit to the ShareGPT community, but I would prefer posting programming solutions to this instance. However, I'm not sure if AI-generated solutions are welcomed on programming.dev. I'd love to hear your thoughts on this. If AI-generated responses are accepted, how should we format the answers, should we just copy paste without quoting, should we quote the model, just mention that it's AI-generated,...?

12mo ago

[Solved] How would you debug this script without creating many posts?

Just change lemmy.post.create to lemmy.post.createe to trigger an AttributeError. That way you can debug the code without creating any posts. You can also use many print statements all around the code, I would use two for each line to make sure the computer isn't fooling you. Lastly, you can spin up your own Lemmy instance to not have to worry about the generated posts.

Lemmy Support @lemmy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

How to see the feed of another instance?

I'm wondering if it's possible to see the local feed of another instance from the one I'm using. I'm interested in exploring content from other instances without having to visit every single community, but I'm not sure how to do it. I've tried searching for a way to do this on the documentation and using the Lemmy search, but I haven't found any clear instructions. Does anyone know how to see the local feed of another instance? Any help or guidance would be greatly appreciated!

12mo ago

Does commenting on your own post bump it on the active filter view in Lemmy?

Testing.

https://join-lemmy.org/docs/users/03-votes-and-ranking.html

Edit: I was wrong. The ranking that works like forums is New Comments, and yes, it seems to take the OP's comments into account.

Lemmy Support @lemmy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

Does commenting on your own post bump it on the active filter view in Lemmy?

In Lemmy, the active filter view is designed to prioritize posts with the latest activity, similar to how forums work. However, it remains unclear whether commenting on your own post in Lemmy will bump it on the active filter view. Some forum platforms, such as Discourse, allow a practice known as the "ghost bump," where users can make a post and delete it to draw attention to their post without adding new content[^1]. While it is uncertain if this is possible on Lemmy, it's worth noting that even if it were, it would result in an unnecessary comment that cannot be completely removed. The comment would still be visible, indicating that it was deleted by the post's creator. If you have any experience with Lemmy's active filter view or know whether commenting on your own post bumps it, please share your thoughts in the comments below.

[^1]: What is "Bumping Topics"

Lemmy @lemmy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

Seeking a Forgejo-hosted Mirror for Lemmy Outside the US

As an enthusiastic supporter of Lemmy, I am eager to contribute to the project. However, I hold strong reservations about writing a single line of code for a project hosted on a Micro$oft server. While I have created a few issues on GitHub, I firmly believe that my contributions could be significantly amplified if there were a mirror of Lemmy that utilized Forgejo hosting outside the United States. I would be absolutely delighted to have the opportunity to contribute more actively to this incredible project if such an alternative hosting option were available.

12mo ago

Will Superintelligent AI End the World? | Eliezer Yudkowsky | TED (video 11.07.2023)

Permanently Deleted

Singularity | Artificial Intelligence (ai), Technology & Futurology @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

(Rumored Leak of) GPT-4 Architecture, Infrastructure, Training Dataset, Costs, Vision, MoE (11.07.2023)

threadreaderapp.com Thread by @Yampeleg on Thread Reader App

GPT-4's details are leaked.

It is over.

Everything is here: https://archive.is/2RQ8X

Parameters count:

GPT-4 is more than 10x the size of GPT-3. We believe it has a total of ~1.8 trillion parameters across 120 layers.

Mixture Of Experts - Confirmed.

OpenAI was able to keep costs reasonable by utilizing a mixture of experts (MoE) model. They utilize 16 experts within their model, each with about 111B parameters for the MLP. 2 of these experts are routed to per forward pass.

MoE Routing:

While the literature talks a lot about advanced routing algorithms for choosing which experts to route each token to, OpenAI’s is allegedly quite simple, for the current GPT-4 model.

There are roughly 55B shared parameters for attention.

Inference:

Each forward pass inference (generation of 1 token) only utilizes ~280B parameters and ~560 TFLOPs. This contrasts with the ~1.8 trillion parameters and ~3,700 TFLOP that would be required per forward pass of a purely dense model.
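As a quick sanity check, the quoted figures can be combined to see whether they are consistent with each other. The sketch below only uses numbers from the thread (which are rumored and unverified); the variable names are illustrative.

```python
# Figures quoted in the thread (rumored, unverified).
NUM_EXPERTS = 16
PARAMS_PER_EXPERT = 111e9        # MLP parameters per expert
SHARED_ATTENTION_PARAMS = 55e9   # shared attention parameters
EXPERTS_PER_TOKEN = 2            # experts routed to per forward pass

total_params = NUM_EXPERTS * PARAMS_PER_EXPERT + SHARED_ATTENTION_PARAMS
active_params = EXPERTS_PER_TOKEN * PARAMS_PER_EXPERT + SHARED_ATTENTION_PARAMS

print(f"total  ~ {total_params / 1e12:.2f}T parameters")
print(f"active ~ {active_params / 1e9:.0f}B parameters per token")
print(f"active fraction ~ {active_params / total_params:.1%}")
print(f"FLOP fraction   ~ {560 / 3700:.1%}")
```

This gives about 1.83T total and 277B active parameters, in line with the quoted ~1.8T and ~280B, and the active fraction of roughly 15% matches the ratio of the two TFLOP figures.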

Dataset:

GPT-4 is trained on ~13T tokens.

These are not unique tokens; repeated epochs are counted as additional tokens as well.

Epoch number: 2 epochs for text-based data and 4 for code-based data.

There are millions of rows of instruction fine-tuning data from ScaleAI and from internal sources.

GPT-4 32K

There was an 8k context length (seqlen) for the pre-training phase. The 32k seqlen version of GPT-4 is based on fine-tuning of the 8k after the pre-training.

Batch Size:

The batch size was gradually ramped up over a number of days on the cluster, but by the end, OpenAI was using a batch size of 60 million! This, of course, is “only” a batch size of 7.5 million tokens per expert due to not every expert seeing all tokens.

For the real batch size:

Divide this number by the seq len to get the real batch size. Just stop with these misleading numbers already.
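The arithmetic behind both statements can be sketched as follows. It assumes that "8k" means 8,192 tokens and that tokens are spread evenly over the experts; neither assumption is stated in the thread.

```python
BATCH_TOKENS = 60_000_000   # batch size in tokens, as quoted
SEQ_LEN = 8_192             # assumption: "8k" means 8,192 tokens
NUM_EXPERTS = 16
EXPERTS_PER_TOKEN = 2

# "Real" batch size: number of sequences per batch.
sequences_per_batch = BATCH_TOKENS / SEQ_LEN

# Each token visits 2 of 16 experts, so with uniform routing
# every expert sees 2/16 of the tokens in the batch.
tokens_per_expert = BATCH_TOKENS * EXPERTS_PER_TOKEN / NUM_EXPERTS

print(f"sequences per batch ~ {sequences_per_batch:,.0f}")
print(f"tokens per expert   ~ {tokens_per_expert:,.0f}")
```

Under these assumptions the batch holds roughly 7,300 sequences, and each expert sees 7.5 million tokens, which matches the per-expert figure quoted above.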

Parallelism Strategies

To parallelize across all their A100 GPUs, they utilized 8-way tensor parallelism, as that is the limit for NVLink.

Beyond that, they are using 15-way pipeline parallelism.

(They likely used ZeRO Stage 1. It is possible they used block-level FSDP.)

Training Cost

OpenAI’s training FLOPS for GPT-4 is ~2.15e25, on ~25,000 A100s for 90 to 100 days at about 32% to 36% MFU.

Part of this extremely low utilization is due to an absurd number of failures requiring checkpoints that needed to be restarted from.

If their cost in the cloud was about $1 per A100 hour, the training costs for this run alone would be about $63 million.

(Today, the pre-training could be done with ~8,192 H100 in ~55 days for $21.5 million at $2 per H100 hour.)
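These cost figures can be roughly reproduced from the quoted inputs. In the sketch below, the A100 peak throughput of 312 TFLOPS (dense BF16) is an assumption that does not appear in the thread, and the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def cost_usd(num_gpus, days, price_per_gpu_hour):
    """Cloud cost of keeping num_gpus busy for the given number of days."""
    return num_gpus * days * 24 * price_per_gpu_hour


def training_days(total_flops, num_gpus, peak_flops_per_gpu, mfu):
    """Days needed to perform total_flops at a given model FLOPs utilization."""
    effective_flops_per_second = num_gpus * peak_flops_per_gpu * mfu
    return total_flops / effective_flops_per_second / SECONDS_PER_DAY


A100_PEAK = 312e12  # assumption: dense BF16 peak FLOPS of one A100

for mfu in (0.32, 0.36):
    days = training_days(2.15e25, 25_000, A100_PEAK, mfu)
    print(f"MFU {mfu:.0%}: ~{days:.0f} days on 25,000 A100s")

print(f"A100 run, 90 days:  ${cost_usd(25_000, 90, 1.0) / 1e6:.1f}M")
print(f"A100 run, 100 days: ${cost_usd(25_000, 100, 1.0) / 1e6:.1f}M")
print(f"H100 run, 55 days:  ${cost_usd(8_192, 55, 2.0) / 1e6:.1f}M")
```

With these inputs the duration comes out at roughly 89 to 100 days, consistent with the quoted range. The A100 cost lands between $54M and $60M, a little under the quoted $63M, so the thread's figure presumably rests on slightly different inputs. The H100 estimate of about $21.6M is close to the quoted $21.5M.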

Mixture of Expert Tradeoffs

There are multiple MoE tradeoffs taken: For example, MoE is incredibly difficult to deal with on inference because not every part of the model is utilized on every token generation.

This means parts may sit dormant when other parts are being used. When serving users, this really hurts utilization rates.

Researchers have shown that using 64 to 128 experts achieves better loss than 16 experts, but that’s purely research.

There are multiple reasons to go with fewer experts. One reason OpenAI chose 16 experts is that models with more experts have difficulty generalizing across many tasks. It can also be harder to achieve convergence with more experts.

With such a large training run, OpenAI instead chose to be more conservative on the number of experts.

GPT-4 Inference Cost

GPT-4 costs 3x that of the 175B parameter Davinci.

This is largely due to the larger clusters required for GPT-4 and much lower utilization achieved.

An estimate of its cost is $0.0049 cents per 1k tokens for 128 A100s to run inference on GPT-4 8k seqlen and $0.0021 cents per 1k tokens for 128 H100s to run inference on GPT-4 8k seqlen. Note that this assumes decently high utilization and high batch sizes.

Multi-Query Attention

OpenAI are using MQA just like everybody else.

Because of that, only 1 key/value head is needed and the memory capacity required for the KV cache can be significantly reduced. Even then, the 32k seqlen GPT-4 definitely cannot run on 40GB A100s, and the 8k is capped on max bsz.
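To see why MQA shrinks the KV cache, here is a minimal sketch of the usual cache-size formula. Only the 120 layers come from this post; the head count (96), head dimension (128) and 2-byte elements are hypothetical placeholders chosen for illustration, not known GPT-4 values.

```python
def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_elem: int = 2,
) -> int:
    """Size of the KV cache: one K and one V tensor per layer, per KV head."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem


if __name__ == "__main__":
    layers = 120       # from the post
    query_heads = 96   # hypothetical
    head_dim = 128     # hypothetical
    seq_len = 32_768   # assumed meaning of "32k"

    mha = kv_cache_bytes(layers, query_heads, head_dim, seq_len)  # one KV head per query head
    mqa = kv_cache_bytes(layers, 1, head_dim, seq_len)            # single shared KV head
    print(f"MHA: {mha / 2**30:.1f} GiB per sequence")
    print(f"MQA: {mqa / 2**30:.3f} GiB per sequence")
    print(f"reduction: {mha // mqa}x")
```

Whatever the real dimensions are, the reduction factor is simply the number of query heads, because everything else in the formula stays the same.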

Continuous batching

OpenAI implements both variable batch sizes and continuous batching. This allows some bound on maximum latency while also optimizing inference costs.
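For intuition on why continuous batching helps utilization, here is a toy simulation. It is not OpenAI's scheduler, just a sketch that assumes every active request produces one token per decode step and compares waiting for a whole batch to finish against refilling freed slots immediately.

```python
from collections import deque
from typing import List


def static_batching_steps(lengths: List[int], max_batch: int) -> int:
    """Decode steps when each batch must fully finish before the next one starts."""
    steps = 0
    for i in range(0, len(lengths), max_batch):
        steps += max(lengths[i:i + max_batch])  # the longest request holds the batch
    return steps


def continuous_batching_steps(lengths: List[int], max_batch: int) -> int:
    """Decode steps when a finished request's slot is refilled right away."""
    queue = deque(lengths)
    active: List[int] = []  # remaining tokens for each in-flight request
    steps = 0
    while queue or active:
        while queue and len(active) < max_batch:
            active.append(queue.popleft())
        steps += 1
        active = [r - 1 for r in active if r - 1 > 0]
    return steps


if __name__ == "__main__":
    output_lengths = [2, 8, 3, 1, 4, 2]  # hypothetical tokens to generate per request
    print(static_batching_steps(output_lengths, max_batch=2))      # 15
    print(continuous_batching_steps(output_lengths, max_batch=2))  # 10
```

In the static case short requests sit idle behind the 8-token one. In the continuous case both slots stay busy, so the 20 total tokens finish in 10 steps.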

Vision Multi-Modal

It is a separate vision encoder from the text encoder, with cross-attention. The architecture is similar to Flamingo. This adds more parameters on top of the 1.8T of GPT-4. It is fine-tuned with another ~2 trillion tokens, after the text only pre-training.

On the vision model, OpenAI wanted to train it from scratch, but it wasn’t mature enough, so they wanted to derisk it by starting with text.

One of the primary purposes of this vision capability is for autonomous agents able to read web pages and transcribe what’s in images and video.

Some of the data they train on is joint data (rendered LaTeX/text), screenshots of web pages, and YouTube videos: they sample frames and run Whisper on them to get a transcript.

[Don't want to say "I told you so" but...]

Speculative Decoding

OpenAI might be using speculative decoding for GPT-4's inference. (Not 100% sure.)

The idea is to use a smaller, faster model to decode several tokens in advance, and then feed them into a large oracle model as a single batch.

If the small model was right about its predictions – the larger model agrees and we can decode several tokens in a single batch.

But if the larger model rejects the tokens predicted by the draft model then the rest of the batch is discarded. And we continue with the larger model.
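The accept/reject loop described above can be sketched with toy models. This is the simple greedy variant, where a drafted token is accepted only if it equals the oracle's own next token. The `toy_draft` and `toy_oracle` functions are made up for illustration, and a real system would verify all drafted positions in one batched forward pass instead of a Python loop.

```python
from typing import Callable, List

Model = Callable[[List[int]], int]  # maps a non-empty token context to the next token


def toy_oracle(ctx: List[int]) -> int:
    """Hypothetical 'large' model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    """Hypothetical 'small' model: agrees with the oracle except after token 4."""
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


def speculative_step(prefix: List[int], draft: Model, oracle: Model, k: int) -> List[int]:
    """Draft k tokens, keep the agreed prefix, and let the oracle fix the first mismatch."""
    proposed: List[int] = []
    ctx = list(prefix)
    for _ in range(k):
        token = draft(ctx)
        proposed.append(token)
        ctx.append(token)

    accepted: List[int] = []
    ctx = list(prefix)
    for token in proposed:
        expected = oracle(ctx)
        if expected != token:
            accepted.append(expected)  # rest of the draft is discarded
            return accepted
        accepted.append(token)
        ctx.append(token)
    return accepted


def generate(prefix: List[int], draft: Model, oracle: Model, k: int, n_new: int) -> List[int]:
    """Generate n_new tokens after prefix using repeated speculative steps."""
    out = list(prefix)
    while len(out) - len(prefix) < n_new:
        out.extend(speculative_step(out, draft, oracle, k))
    return out[:len(prefix) + n_new]


if __name__ == "__main__":
    print(speculative_step([1], toy_draft, toy_oracle, k=4))  # [2, 3, 4, 5]
    print(generate([1], toy_draft, toy_oracle, k=4, n_new=12))
```

With this strict check the output is identical to what the oracle would produce on its own; the only gain is fewer oracle passes. Quality could only change if the acceptance rule were loosened.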

The conspiracy theory that the new GPT-4's quality has deteriorated might simply be because they are letting the oracle model accept lower probability sequences from the speculative decoding model.

Inference Architecture

The inference runs on a cluster of 128 GPUs.

There are multiple of these clusters in multiple datacenters in different locations.

It is done in 8-way tensor parallelism and 16-way pipeline parallelism.

Each node of 8 GPUs has only ~130B parameters, or… twitter.com/i/web/status/1…

The model has 120 layers, so it fits in 15 different nodes. [Possibly there are fewer layers on the first node since it also needs to compute the embeddings.]
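A quick sketch of how 120 layers could be spread over pipeline stages, assuming an even split with one stage per 8-GPU node. The post mentions both 16-way pipeline parallelism and 15 nodes, so both are shown; the actual assignment is not known.

```python
from typing import List


def split_layers(num_layers: int, num_stages: int) -> List[int]:
    """Distribute layers over stages as evenly as possible (earlier stages get the extras)."""
    base, extra = divmod(num_layers, num_stages)
    return [base + (1 if i < extra else 0) for i in range(num_stages)]


if __name__ == "__main__":
    print(8 * 16)                 # 128 GPUs: 8-way tensor x 16-way pipeline
    print(split_layers(120, 15))  # 8 layers on each of 15 nodes
    print(split_layers(120, 16))  # eight stages of 8 layers, eight stages of 7
```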

According to these numbers, OpenAI should have trained on 2x the tokens if they were trying to be Chinchilla-optimal.

[let alone surpass it like we do]

This goes to show that they are struggling to get high quality data.

Why no FSDP?

A possible reason for this could be that some of the hardware infra they secured is of an older generation.

This is pretty common at local compute clusters as the organisation usually upgrades the infra in several "waves" to avoid a complete pause of operation.… twitter.com/i/web/status/1…

Dataset Mixture

They trained on 13T tokens.

CommonCrawl & RefinedWeb are both 5T.

Remove the duplication of tokens from multiple epochs and we get to a much more reasonable number of "unaccounted for" tokens: The "secret" data.

By this point we already hear rumors that parts of it came from Twitter, Reddit & YouTube.

[Rumors that start to become lawsuits]

Some speculations are:

LibGen (4M+ books)
Sci-Hub (80M+ papers)
All of GitHub

My own opinion:

The missing dataset is a custom dataset of college textbooks collected by hand for as many courses as possible.

This is very easy to convert to a txt file and then, with self-instruct, into instruction form.

This creates the "illusion" that GPT-4 "is smart" no matter who uses it.

Computer scientist? Sure! It can help you with your questions about P != NP.

Philosophy major? It can totally talk to you about epistemology.

Don't you see?

It was trained on the textbooks. It is so obvious.

There are also papers that try to forcibly extract memorized parts of books from GPT-4 to understand what it was trained on.

There are some books it knows so well that it must have seen them.

Moreover, if I remember correctly, it even knows the unique IDs of Project Euler exercises.

12mo ago

Focused Transformer: Contrastive Training for Context Scaling - 256k context length AI

The paper actually demonstrates a 16-million context window with 92% accuracy. Most models can be retrained to have a 100k context window with over 92% accuracy, but the accuracy drops to 74% at 256k. The code has already been released on GitHub as well. I'm excited to see the development of 100k models using this method soon!

Singularity | Artificial Intelligence (ai), Technology & Futurology @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

Focused Transformer: Contrastive Training for Context Scaling - 256k context length AI

arxiv.org/abs/2307.03170

Summary:

> Focused Transformer: A new technique for long-context language modeling. The paper introduces Focused Transformer (FOT), a method that uses contrastive learning and external memory to improve the structure of the (key, value) space and extend the context length of transformer models. FOT can fine-tune existing large models without changing their architecture and achieve better performance on tasks that require long context.
>
> LONGLLAMA: Extending LLaMA’s context length with FOT. The paper demonstrates the application of FOT to fine-tune OpenLLaMA models, which are large language models with memory augmentation. The resulting models, called LONGLLAMAs, can handle a context length of up to 256k tokens and show improvements on few-shot learning tasks such as TREC and WebQS.
>
> Distraction issue: A key challenge for scaling context length. The paper identifies the distraction issue as a major obstacle for using large memory databases in multi-document scenarios. The distraction issue occurs when keys from irrelevant documents overlap with keys from relevant ones, making them hard to distinguish. FOT alleviates this issue by exposing the memory attention layer to both positive and negative examples during training.

ELI5

> Sure! Imagine you have a toy box with lots of toys inside. You want to find your favorite toy, but there are so many toys that it's hard to find it. The Focused Transformer is like a special helper that can look inside the toy box and find your favorite toy quickly, even if there are lots of other toys in the way. It does this by remembering which toys are important and which ones are not, so it can find the right toy faster. Does that make sense?

Implications

> The Focused Transformer (FOT) technique has the potential to improve the performance of language models by extending their context length. This means that the models can better understand and incorporate new information, even when it is spread across a large number of documents. The resulting LONGLLAMA models show significant improvements on tasks that require long-context modeling, such as retrieving information from large databases. This research could have implications for natural language processing, code generation, quantitative reasoning, and theorem proving, among other areas. It could also make it easier to fine-tune existing large-scale models to lengthen their effective context. Is there anything else you would like to know?

12mo ago

Where can I seek medical advice online?

You don't have any idea of how GPT works. Read about it and then we can talk.

12mo ago

Where can I seek medical advice online?

Comparing current LLMs with autocomplete is stupid. An autocomplete can't pass law or biology exams in the 90th percentile like GPT-4 can.

No Stupid Questions @lemmy.world InternetPirate @lemmy.fmhy.ml 12mo ago

Where can I seek medical advice online?

Recently, I found myself questioning the accuracy of a diagnosis provided by a doctor I visited. Surprisingly, an AI seemed to offer a more insightful assessment. However, I understand the importance of not solely relying on AI-generated information. With that in mind, I'm eager to discover a reputable online platform where I can seek medical advice. Ideally, I hope to find a community where I can obtain multiple opinions to make a more informed decision about my health. If anyone could recommend such a site, I would greatly appreciate it.

ShareGPT @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

Price of insulin in China, Cuba, Laos, North Korea, and Vietnam?

Information on the price of insulin in China, Cuba, Laos, North Korea, and Vietnam is not readily available in the search results. However, there is information on the cost of insulin in some other countries. According to the World Population Review, the cost of insulin varies widely by country, with the United States having the highest prices[1]. In 2018, RAND Corporation reported that the manufacturer price for any given type of insulin averaged five to ten times higher in the United States than in all other OECD countries[1]. Novo Nordisk, a major insulin manufacturer, has recently slashed the price of insulin in the United States[3].

A study published in PMC in 2019 examined the prices, availability, and affordability of insulin in 13 low-income and middle-income countries. The study found that median procurement prices of human insulins across the countries ranged from $4.56 to $5.30, with large price variations seen across the countries[4]. The highest prices were found in two Chinese provinces, with isophane human insulin costing $24.72 in Shaanxi, China[4].

There is also information on the cost of insulin in South Korea, which stands at an average of $10.30 per unit[2]. According to Volza's Global Import data, insulin syringe import shipments from South Korea stood at 1.6K, imported by 98 World Importers from 82 South Korea[5]. China is one of the top exporters of insulin syringes, with 2,155 shipments[6].

Citations:

[1] https://worldpopulationreview.com/country-rankings/cost-of-insulin-by-country

[2] https://wisevoter.com/country-rankings/cost-of-insulin-by-country/

[3] https://www.fiercepharma.com/pharma/novo-nordisk-follows-eli-lillys-lead-slashing-insulin-prices-us

[4] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6570978/

[5] https://www.volza.com/imports-global/global-import-data-of-insulin+syringe-from-south-korea

[6] https://www.volza.com/p/syringe-insulin/export/export-from-china/

World News @lemmy.world InternetPirate @lemmy.fmhy.ml 12mo ago

French Courts Are Giving Protesters INSANE Sentences

French courts have been imposing disproportionately severe sentences for minor offenses, including 10 months in prison for stealing a can of Red Bull and one year for a homeless boy with schizophrenia caught looting a luxury store. The overwhelmed courts rush cases, provide minimal time for defendants, and prioritize punishment under the instruction of the Justice Minister. Furthermore, the French government is censoring social media and justifying it by claiming to protect public order, but it infringes upon free speech and mirrors tactics used by authoritarian regimes. The justice system exhibits a double standard, favoring the privileged, and creates a class divide, leading to unrest. Ironically, the government compares itself to oppressive nations while undermining democratic principles.

World News @lemmy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

French Courts Are Giving Protesters INSANE Sentences

12mo ago

What would you do if you had access to a superintelligent AGI?

Hopefully there are some people more positive than that, willing to change society so AGI doesn't make most humans starve to death or be imprisoned.

Singularity | Artificial Intelligence (ai), Technology & Futurology @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

'Deepfake' Kanye Video Warns of Disinformation and Civil Unrest: 'AI Will Kill The Media Industry' (Yahoo article 05.07.2023)

finance.yahoo.com 'Deepfake' Kanye Video Warns of Disinformation and Civil Unrest: 'AI Will Kill The Media Industry'

A controversial short film is raising eyebrows and questions about the use of "deepfakes" and generative AI. Created in just 7 days on a budget of only $30, the AI-generated video set to the song "War With The Matrix" is a radical prediction of the impact of AI on media and society by acclaimed hip-...

[YANDHI - WAR WITH THE MATRIX (KANYE AI X BIG BABY GANDHI)](https://youtube.com/watch?v=CGyPqImBOjY)

Singularity | Artificial Intelligence (ai), Technology & Futurology @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

ChatGPT Code Interpreter: Image to Video Prompt (Midjourney) (video 09.07.2023)

12mo ago

PSA: Lemmy votes can be manipulated

I feel like this is what happened when you’d see posts with hundreds / thousands of upvotes but had only 20-ish comments.

Nah it's the same here in Lemmy. It's because the algorithm only accounts for votes and not for user engagement.

12mo ago

YSK: Sorting by 'Hot' instead of 'Active' will show more diverse content on Lemmy

You can't claim it's different either, so? I'll still claim whatever the fuck I want.

12mo ago

YSK: Sorting by 'Hot' instead of 'Active' will show more diverse content on Lemmy

You can't have a source to Reddit's proprietary algorithms lol. Ask u/spez.

12mo ago

Using ChatGPT to turn almost anyone into a replica digital assistant (26.06.2023)

I personally prefer it colorful as it is, or even more colorful like Matt Wolfe's Midjourney generated thumbnails. It's a nice change of pace from the usual thumbnails.

12mo ago

YSK: Sorting by 'Hot' instead of 'Active' will show more diverse content on Lemmy

Top Hour is the same as Reddit's rising option.

12mo ago

What would you do if you had access to a superintelligent AGI?

I was thinking about this a few days ago. GANs and the Simulation Hypothesis: An AI Perspective

12mo ago

What would you do if you had access to a superintelligent AGI?

Locked in a room with an internet connection? A lot. But without any contact with the outside world? Not nearly as much. It could have other people running experiments for it with an internet connection, but not without one.

Anyway, whether or not the AGI can interact with the real world undermines the purpose of my explicit statement in the question. I specifically mentioned that it only operates as a human on a computer. I didn't mention it could acquire a physical body, so let's just assume it can't and can't use other people to do physical labor either.

12mo ago

What would you do if you had access to a superintelligent AGI?

I heard disruptive science is slowing down which I think means pretty much everything possible has already been thought of. So talking about things that exist, do you mean a cheaper solar panel or wind/water turbine? Or are we talking about science fiction like an Arc Reactor?

Asklemmy @lemmy.ml InternetPirate @lemmy.fmhy.ml 12mo ago

What would you do if you had access to a superintelligent AGI?

Imagine an AGI (Artificial General Intelligence) that could perform any task a human can do on a computer, but at a much faster pace. This AGI could create an operating system, produce a movie better than anything you've ever seen, and much more, all while being limited to SFW (Safe For Work) content. What are the first things you would ask it to do?

